perticarari committed
Commit 008a8d1 · verified · 1 Parent(s): 3d96838

Initial commit

Files changed (4)
  1. README.md +3 -65
  2. custom_trans.py +46 -0
  3. model.safetensors +1 -1
  4. modules.json +1 -1
README.md CHANGED
@@ -50,36 +50,6 @@ datasets:
 - mteb/sts12-sts
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
-metrics:
-- pearson_cosine
-- spearman_cosine
-- cosine_accuracy
-model-index:
-- name: SentenceTransformer
-  results:
-  - task:
-      type: semantic-similarity
-      name: Semantic Similarity
-    dataset:
-      name: Unknown
-      type: unknown
-    metrics:
-    - type: pearson_cosine
-      value: 0.2502604111969662
-      name: Pearson Cosine
-    - type: spearman_cosine
-      value: 0.2861642394156719
-      name: Spearman Cosine
-  - task:
-      type: triplet
-      name: Triplet
-    dataset:
-      name: Unknown
-      type: unknown
-    metrics:
-    - type: cosine_accuracy
-      value: 0.844
-      name: Cosine Accuracy
 ---
 
 # SentenceTransformer
@@ -170,27 +140,6 @@ You can finetune this model on your own dataset.
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
-## Evaluation
-
-### Metrics
-
-#### Semantic Similarity
-
-* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
-
-| Metric              | Value      |
-|:--------------------|:-----------|
-| pearson_cosine      | 0.2503     |
-| **spearman_cosine** | **0.2862** |
-
-#### Triplet
-
-* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
-
-| Metric              | Value     |
-|:--------------------|:----------|
-| **cosine_accuracy** | **0.844** |
-
 <!--
 ## Bias, Risks and Limitations
 
@@ -264,7 +213,7 @@ You can finetune this model on your own dataset.
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 32
 - `learning_rate`: 1e-05
-- `num_train_epochs`: 10
+- `num_train_epochs`: 1
 
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
@@ -286,7 +235,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
-- `num_train_epochs`: 10
+- `num_train_epochs`: 1
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
@@ -387,24 +336,13 @@ You can finetune this model on your own dataset.
 
 </details>
 
-### Training Logs
-| Epoch | Step | Training Loss | Validation Loss | spearman_cosine | cosine_accuracy |
-|:-----:|:----:|:-------------:|:---------------:|:---------------:|:---------------:|
-| 3.125 | 100  | 6.523         | 6.3663          | 0.2497          | -               |
-| 6.25  | 200  | 6.0248        | 6.3467          | 0.2702          | -               |
-| 9.375 | 300  | 5.8616        | 6.3936          | 0.2862          | -               |
-| 3.125 | 100  | 2.1251        | 1.2034          | -               | 0.854           |
-| 6.25  | 200  | 1.6618        | 1.2496          | -               | 0.843           |
-| 9.375 | 300  | 1.6239        | 1.2676          | -               | 0.844           |
-
-
 ### Framework Versions
 - Python: 3.10.12
 - Sentence Transformers: 3.3.1
 - Transformers: 4.46.2
 - PyTorch: 2.5.1+cu121
 - Accelerate: 1.1.1
-- Datasets: 3.1.0
+- Datasets: 2.21.0
 - Tokenizers: 0.20.3
 
 ## Citation
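Both hyperparameter hunks change `num_train_epochs` from 10 to 1, and the Datasets version is corrected to 2.21.0. For reference only, a minimal sketch of how the card's non-default hyperparameters map onto sentence-transformers v3 training arguments; the four values come from the card, while `output_dir` is a placeholder:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Sketch: only the four values below are taken from the card's
# "Non-Default Hyperparameters" list; output_dir is a placeholder.
args = SentenceTransformerTrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-05,
    num_train_epochs=1,
)
```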
custom_trans.py ADDED
@@ -0,0 +1,46 @@
+import torch
+import torch.nn as nn
+from sentence_transformers import models
+
+
+class CustTrans(models.Transformer):
+    """Transformer module that applies a task-specific mask to token embeddings."""
+
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.curr_task_type = None
+        self._rebuild_taskembedding(['sts', 'quora'])
+
+    def forward(self, inputs, task_type=None):
+        enc = self.auto_model(**inputs).last_hidden_state
+
+        # Fall back to the task type set via _set_curr_task_type.
+        if task_type is None:
+            task_type = self.curr_task_type
+
+        if task_type in self.task_types:
+            idx = torch.tensor(self.task_types.index(task_type),
+                               device=self.TaskEmbedding.weight.device)
+            hyp = self.TaskEmbedding(idx)
+            inputs['token_embeddings'] = self._project(enc, hyp)
+        else:
+            # Unknown task type: pass the encoder output through unchanged.
+            inputs['token_embeddings'] = enc
+
+        return inputs
+
+    def _set_curr_task_type(self, task_type):
+        self.curr_task_type = task_type
+
+    def _set_taskembedding_grad(self, value):
+        self.TaskEmbedding.weight.requires_grad = value
+
+    def _set_transformer_grad(self, value):
+        for param in self.auto_model.parameters():
+            param.requires_grad = value
+
+    def _rebuild_taskembedding(self, task_types):
+        # Each task vector is all ones except for a 0 at its own index,
+        # so _project zeroes out one hidden dimension per task.
+        self.task_types = task_types
+        self.task_emb = 1 - torch.eye(len(self.task_types), 768)
+        # from_pretrained is a classmethod; calling it on the class avoids
+        # building a throwaway nn.Embedding first.
+        self.TaskEmbedding = nn.Embedding.from_pretrained(self.task_emb)
+
+    def _project(self, v, normal_hyper):
+        # Hyperplane projection disabled: torch.dot only accepts 1-D tensors,
+        # so it cannot handle batched token embeddings as written.
+        # return v - torch.dot(v, normal_hyper)*normal_hyper / torch.norm(normal_hyper)**2
+        return v * normal_hyper
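The commented-out line in `_project` is the standard projection of `v` onto the hyperplane with normal `normal_hyper`, but `torch.dot` only works on 1-D tensors, so it cannot be applied to `[batch, seq, dim]` token embeddings as written. A batched version, as a hedged sketch (not part of the commit; `project_onto_hyperplane` is an illustrative name):

```python
import torch

def project_onto_hyperplane(v: torch.Tensor, normal: torch.Tensor) -> torch.Tensor:
    """Project v ([batch, seq, dim]) onto the hyperplane whose normal is `normal` ([dim])."""
    # Component of v along the normal direction, computed per token via broadcasting.
    coeff = (v * normal).sum(dim=-1, keepdim=True) / normal.norm() ** 2
    return v - coeff * normal
```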
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f3f5cb0e45d40583bb1696b6113195cb1a34650541ded587f9e9cdee78985248
+oid sha256:8e47716a979def3ee4331621abb95a2a07619cf6428ca798c051201cbbc0ff89
 size 437951328
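This entry is a Git LFS pointer file rather than the weights themselves: only the `oid` content hash changes, and the new weights happen to have the same byte size as before.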
modules.json CHANGED
@@ -3,7 +3,7 @@
     "idx": 0,
     "name": "0",
     "path": "",
-    "type": "__main__.CustTrans"
+    "type": "custom_trans.CustTrans"
   },
   {
     "idx": 1,