Add a cross-encoder checkpoint based on google-bert/bert-base-cased

#1
crossencoder-checkpoints/checkpoint-googlebert-10000/README.md ADDED
@@ -0,0 +1,523 @@
---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- cross-encoder
- generated_from_trainer
- dataset_size:330152
- loss:CachedMultipleNegativesRankingLoss
base_model: google-bert/bert-base-cased
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on google-bert/bert-base-cased
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: dev
      type: dev
    metrics:
    - type: map
      value: 0.9441464035183368
      name: Map
    - type: mrr@10
      value: 0.9441464035183368
      name: Mrr@10
    - type: ndcg@10
      value: 0.9703932950632154
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.2676
      name: Map
    - type: mrr@10
      value: 0.304
      name: Mrr@10
    - type: ndcg@10
      value: 0.3307
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoSCIDOCS R100
      type: NanoSCIDOCS_R100
    metrics:
    - type: map
      value: 0.2425
      name: Map
    - type: mrr@10
      value: 0.5271
      name: Mrr@10
    - type: ndcg@10
      value: 0.2968
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoSciFact R100
      type: NanoSciFact_R100
    metrics:
    - type: map
      value: 0.6758
      name: Map
    - type: mrr@10
      value: 0.6809
      name: Mrr@10
    - type: ndcg@10
      value: 0.7085
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.3953
      name: Map
    - type: mrr@10
      value: 0.504
      name: Mrr@10
    - type: ndcg@10
      value: 0.4453
      name: Ndcg@10
---

# CrossEncoder based on google-bert/bert-base-cased

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) <!-- at revision cd5ef92a9fb2f889e972770a36d4ed042daf221e -->
- **Maximum Sequence Length:** 512 tokens
- **Number of Output Labels:** 1 label
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['neuropeptides in small intestine enteric system', 'the enteric system in the gut wall (figure 6–2) is the most extensively studied system containing nanc neurons in addition to cholinergic and adrenergic fibers. in the small intestine, for example, these neurons contain one or more of the following: nitric oxide synthase (which produces nitric oxide, no), calcitonin gene-related peptide, cholecystokinin, dynorphin, enkephalins, gastrin-releasing peptide, 5-hydroxytryptamine (5-ht, serotonin), neuropeptide y, somatostatin, substance p, and vasoactive intestinal peptide (vip). some neurons contain as many as five different transmitters.'],
    ['how does the timing of rubella virus infection during pregnancy affect the outcome for the fetus?', 'congenital rubella syndrome the most serious consequence of rubella virus infection can develop when a woman becomes infected during pregnancy, particularly during the first trimester. the resulting complications may include miscarriage, fetal death, premature delivery, or live birth with congenital defects. infants infected with rubella virus in utero may have myriad physical defects (table 230e-1), which most commonly relate to the eyes, ears, and heart. this constellation of severe birth defects is known as congenital rubella syndrome. in addition to permanent manifestations, there are a host of transient physical manifestations, including thrombocytopenia with purpura/petechiae (e.g., dermal erythropoiesis, “blueberry muffin syndrome”). some infants may be born with congenital rubella virus infection but have no apparent signs or symptoms of crs and are referred to as “infants with congenital rubella infection only.”'],
    ['structure and function of β barrels in membrane proteins', 'most multipass membrane proteins in eukaryotic cells and in the bacterial plasma membrane are constructed from transmembrane α helices. the helices figure 10–22 steps in the folding of a multipass transmembrane protein. when a newly synthesized transmembrane α helix is released into the lipid bilayer, it is initially surrounded by lipid molecules. as the protein folds, contacts between the helices displace some of the lipid molecules surrounding the helices. figure 10–23 β barrels formed from different numbers of β strands.'],
    ['c1 complex activation in classical complement pathway', 'c1 complex, c1 protein complex activated as the first step in the classical pathway of complement activation, composed of c1q bound to two molecules each of the proteases c1r and c1s. binding of a pathogen or antibody to c1q activates c1r, which cleaves and activates c1s, which cleaves c4 and c2. c1 inhibitor (c1inh) an inhibitor protein for c1 that binds and inactivates c1r:c1s enzymatic activity. deficiency in c1inh causes hereditary angioedema through production of vasoactive peptides that cause subcutaneous and laryngeal swelling. c2 complement protein of the classical and lectin pathways that is cleaved by the c1 complex to yield c2b and c2a. c2a is an active protease that forms part of the classical c3 convertase c4bc2a. c3 complement protein on which all complement activation pathways converge. c3 cleavage forms c3b, which can bind covalently to microbial surfaces, where it promotes destruction by phagocytes.'],
    ['eortc trial neoadjuvant chemotherapy advanced ovarian cancer', 'the eortc completed a large randomized trial in 718 patients with advanced ovarian cancer comparing initial surgery followed by six cycles of carboplatin and paclitaxel with three cycles of neoadjuvant chemotherapy followed by surgical debulking and another three cycles of chemotherapy. the study found that the progression-free survival was identical in both arms (12 months) and similarly the overall survival (30 months) was the same in both arms (221). the morbidity of surgery was significantly less in patients receiving neoadjuvant chemotherapy, suggesting that in selected patients with very advanced (stages iiic and iv) ovarian cancer two to three cycles of neoadjuvant chemotherapy prior to surgical debulking is a reasonable option.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'neuropeptides in small intestine enteric system',
    [
        'the enteric system in the gut wall (figure 6–2) is the most extensively studied system containing nanc neurons in addition to cholinergic and adrenergic fibers. in the small intestine, for example, these neurons contain one or more of the following: nitric oxide synthase (which produces nitric oxide, no), calcitonin gene-related peptide, cholecystokinin, dynorphin, enkephalins, gastrin-releasing peptide, 5-hydroxytryptamine (5-ht, serotonin), neuropeptide y, somatostatin, substance p, and vasoactive intestinal peptide (vip). some neurons contain as many as five different transmitters.',
        'congenital rubella syndrome the most serious consequence of rubella virus infection can develop when a woman becomes infected during pregnancy, particularly during the first trimester. the resulting complications may include miscarriage, fetal death, premature delivery, or live birth with congenital defects. infants infected with rubella virus in utero may have myriad physical defects (table 230e-1), which most commonly relate to the eyes, ears, and heart. this constellation of severe birth defects is known as congenital rubella syndrome. in addition to permanent manifestations, there are a host of transient physical manifestations, including thrombocytopenia with purpura/petechiae (e.g., dermal erythropoiesis, “blueberry muffin syndrome”). some infants may be born with congenital rubella virus infection but have no apparent signs or symptoms of crs and are referred to as “infants with congenital rubella infection only.”',
        'most multipass membrane proteins in eukaryotic cells and in the bacterial plasma membrane are constructed from transmembrane α helices. the helices figure 10–22 steps in the folding of a multipass transmembrane protein. when a newly synthesized transmembrane α helix is released into the lipid bilayer, it is initially surrounded by lipid molecules. as the protein folds, contacts between the helices displace some of the lipid molecules surrounding the helices. figure 10–23 β barrels formed from different numbers of β strands.',
        'c1 complex, c1 protein complex activated as the first step in the classical pathway of complement activation, composed of c1q bound to two molecules each of the proteases c1r and c1s. binding of a pathogen or antibody to c1q activates c1r, which cleaves and activates c1s, which cleaves c4 and c2. c1 inhibitor (c1inh) an inhibitor protein for c1 that binds and inactivates c1r:c1s enzymatic activity. deficiency in c1inh causes hereditary angioedema through production of vasoactive peptides that cause subcutaneous and laryngeal swelling. c2 complement protein of the classical and lectin pathways that is cleaved by the c1 complex to yield c2b and c2a. c2a is an active protease that forms part of the classical c3 convertase c4bc2a. c3 complement protein on which all complement activation pathways converge. c3 cleavage forms c3b, which can bind covalently to microbial surfaces, where it promotes destruction by phagocytes.',
        'the eortc completed a large randomized trial in 718 patients with advanced ovarian cancer comparing initial surgery followed by six cycles of carboplatin and paclitaxel with three cycles of neoadjuvant chemotherapy followed by surgical debulking and another three cycles of chemotherapy. the study found that the progression-free survival was identical in both arms (12 months) and similarly the overall survival (30 months) was the same in both arms (221). the morbidity of surgery was significantly less in patients receiving neoadjuvant chemotherapy, suggesting that in selected patients with very advanced (stages iiic and iv) ovarian cancer two to three cycles of neoadjuvant chemotherapy prior to surgical debulking is a reasonable option.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
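
The list returned by `rank` is just the pairwise scores sorted in descending order. As a quick illustration of that output shape, the dictionaries can be reproduced from plain Python (the scores below are made up for the example, not predictions from this checkpoint):

```python
# Illustration only: reproduce the shape of CrossEncoder.rank output from
# hypothetical relevance scores (made-up numbers, not real model output).
scores = [9.1, -2.3, 0.4, -1.7, 1.2]  # one score per candidate document

# Sort candidate indices by descending score, mirroring rank()'s output format
ranks = [
    {"corpus_id": i, "score": s}
    for i, s in sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
]
print(ranks[0])  # the best-scoring candidate comes first
# {'corpus_id': 0, 'score': 9.1}
```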

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Reranking

* Dataset: `dev`
* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
  ```json
  {
      "at_k": 10
  }
  ```

| Metric      | Value      |
|:------------|:-----------|
| map         | 0.9441     |
| mrr@10      | 0.9441     |
| **ndcg@10** | **0.9704** |

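The `map`, `mrr@10`, and `ndcg@10` figures reported here follow the standard ranking-metric definitions. As a reference point, here is a minimal, self-contained sketch of `mrr@10` and `ndcg@10` on a hypothetical binary-relevance ranking (not output from this model):

```python
import math

def mrr_at_k(ranked_relevance, k=10):
    """Reciprocal rank of the first relevant document within the top k."""
    for i, rel in enumerate(ranked_relevance[:k], start=1):
        if rel:
            return 1.0 / i
    return 0.0

def ndcg_at_k(ranked_relevance, k=10):
    """DCG of the given ranking divided by DCG of the ideal ranking."""
    dcg = sum(rel / math.log2(i + 1)
              for i, rel in enumerate(ranked_relevance[:k], start=1))
    ideal = sorted(ranked_relevance, reverse=True)
    idcg = sum(rel / math.log2(i + 1)
               for i, rel in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical ranking of ten candidates: 1 = relevant, 0 = not relevant
ranking = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(mrr_at_k(ranking))   # first relevant hit at rank 2 -> 0.5
print(ndcg_at_k(ranking))  # < 1.0, since the ideal ranking puts both hits first
```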
#### Cross Encoder Reranking

* Datasets: `NanoNQ_R100`, `NanoSCIDOCS_R100` and `NanoSciFact_R100`
* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
  ```json
  {
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | NanoNQ_R100          | NanoSCIDOCS_R100     | NanoSciFact_R100     |
|:------------|:---------------------|:---------------------|:---------------------|
| map         | 0.2676 (-0.1520)     | 0.2425 (-0.0318)     | 0.6758 (+0.0060)     |
| mrr@10      | 0.3040 (-0.1227)     | 0.5271 (-0.0324)     | 0.6809 (+0.0028)     |
| **ndcg@10** | **0.3307 (-0.1699)** | **0.2968 (-0.0384)** | **0.7085 (-0.0014)** |

#### Cross Encoder Nano BEIR

* Dataset: `NanoBEIR_R100_mean`
* Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "nq",
          "scidocs",
          "scifact"
      ],
      "rerank_k": 100,
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | Value                |
|:------------|:---------------------|
| map         | 0.3953 (-0.0593)     |
| mrr@10      | 0.5040 (-0.0508)     |
| **ndcg@10** | **0.4453 (-0.0699)** |

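As the dataset name suggests, `NanoBEIR_R100_mean` is the unweighted average of the three per-dataset results, which the arithmetic confirms:

```python
# Check that the NanoBEIR_R100_mean figures equal the unweighted means of
# the per-dataset results reported above (NanoNQ, NanoSCIDOCS, NanoSciFact).
per_dataset = {
    "map":     [0.2676, 0.2425, 0.6758],
    "mrr@10":  [0.3040, 0.5271, 0.6809],
    "ndcg@10": [0.3307, 0.2968, 0.7085],
}
reported_mean = {"map": 0.3953, "mrr@10": 0.5040, "ndcg@10": 0.4453}

for metric, values in per_dataset.items():
    mean = sum(values) / len(values)
    # Means match the reported values up to 4-decimal rounding
    assert abs(mean - reported_mean[metric]) < 5e-5, metric
    print(f"{metric}: {mean:.4f}")
```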
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 330,152 training samples
* Columns: <code>query</code>, <code>document</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, and <code>negative_4</code>
* Approximate statistics based on the first 1000 samples:

| | query | document | negative_1 | negative_2 | negative_3 | negative_4 |
|:--------|:------|:------|:------|:------|:------|:------|
| type | string | string | string | string | string | string |
| details | <ul><li>min: 29 characters</li><li>mean: 62.28 characters</li><li>max: 149 characters</li></ul> | <ul><li>min: 4 characters</li><li>mean: 774.71 characters</li><li>max: 999 characters</li></ul> | <ul><li>min: 49 characters</li><li>mean: 786.29 characters</li><li>max: 999 characters</li></ul> | <ul><li>min: 22 characters</li><li>mean: 797.75 characters</li><li>max: 999 characters</li></ul> | <ul><li>min: 39 characters</li><li>mean: 805.68 characters</li><li>max: 999 characters</li></ul> | <ul><li>min: 27 characters</li><li>mean: 787.87 characters</li><li>max: 999 characters</li></ul> |

* Samples:

| query | document | negative_1 | negative_2 | negative_3 | negative_4 |
|:------|:------|:------|:------|:------|:------|
| <code>neuropeptides in small intestine enteric system</code> | <code>the enteric system in the gut wall (figure 6–2) is the most extensively studied system containing nanc neurons in addition to cholinergic and adrenergic fibers. in the small intestine, for example, these neurons contain one or more of the following: nitric oxide synthase (which produces nitric oxide, no), calcitonin gene-related peptide, cholecystokinin, dynorphin, enkephalins, gastrin-releasing peptide, 5-hydroxytryptamine (5-ht, serotonin), neuropeptide y, somatostatin, substance p, and vasoactive intestinal peptide (vip). some neurons contain as many as five different transmitters.</code> | <code>28-4). small-intestinal absorption and secretion are tightly regulated; derangements in water and electrolyte homeostasis characteristic of many of the disorders discussed in this chapter play an important role in contributing to their associated clinical features.gut epithelia have two pathways for water transport: (a) the paracellular route, which involves transport through the spaces between cells, (b) the transcellular route, through apical and the basolateral cell membranes, with most occurring through brunicardi_ch28_p1219-p1258.indd 122223/02/19 2:24 pm 1223small intestinechapter 28the transcellular pathway.4 the specific transport mechanisms mediating this transcellular transport are not completely char-acterized, and they may involve passive diffusion through the phospholipid bilayer, cotransport with other ions and nutrients, or diffusion through water channels called aquaporins. many different types of aquaporins have been identified; however, their contribution to</code> | <code>like extensions of the apical surface of each intesti-nal epithelial cell (enterocyte), further increase the surface for absorption of metabolites. mucosal glands extend into the lamina propria. they contain the stem cells and developing cells that will ultimately migrate to the surface of the villi. in the duodenum, submucosal glands (brunner’s glands) secrete an alkaline mucus that helps to neutralize the acidic chyme. enterocytes not only absorb metabolites digested in the intestinal lumen but also synthesize enzymes inserted into the membrane of the mi-crovilli for terminal digestion of disaccharides and dipeptides.</code> | <code>a well-known substance that appears to act as a paracrine hormone within the gastrointestinal tract and pancreas is somatostatin, which inhibits other gas-trointestinal and pancreatic islet endocrine cells. in addition to the established gastrointestinal hormones, several gastrointestinal peptides have not been definitely classified as hormones or paracrine hormones. these pep-tides are designated candidate or putative hormones. other locally active agents isolated from the gastroin-testinal mucosa are neurotransmitters. these agents are released from nerve endings close to the target cell, usu-ally the smooth muscle of the muscularis mucosae, the muscularis externa, or the tunica media of a blood vessel. enteroendocrine cells can also secrete neurotransmitters that activate afferent neurons, sending signals to the cns and enteric division of autonomic nervous system. in addi-tion to acetylcholine (not a peptide), peptides found in nerve fibers of the gastrointestinal tract are</code> | <code>activity in the enteric nervous system is modulated by the sympathetic nervous system. sympathetic post-ganglionic neurons that contain norepinephrine inhibit intestinal motility, those that contain norepinephrine and neuropeptide y regulate blood flow, and those that contain norepinephrine and somatostatin control intestinal secretion. feedback is provided by intestinofugal neurons that project back from the myenteric plexus to the sympathetic ganglia. the submucosal plexus regulates ion and water transport across the intestinal epithelium and glandular secretion. it also communicates with the myenteric plexus to ensure coordination of the functions of the two components of the enteric nervous system. the neurons and neural circuits of the submucosal plexus are not as well understood as those of the myenteric plexus, but many of the neurons contain neuropeptides, and the neural networks are well organized.</code> |
| <code>how does the timing of rubella virus infection during pregnancy affect the outcome for the fetus?</code> | <code>congenital rubella syndrome the most serious consequence of rubella virus infection can develop when a woman becomes infected during pregnancy, particularly during the first trimester. the resulting complications may include miscarriage, fetal death, premature delivery, or live birth with congenital defects. infants infected with rubella virus in utero may have myriad physical defects (table 230e-1), which most commonly relate to the eyes, ears, and heart. this constellation of severe birth defects is known as congenital rubella syndrome. in addition to permanent manifestations, there are a host of transient physical manifestations, including thrombocytopenia with purpura/petechiae (e.g., dermal erythropoiesis, “blueberry muffin syndrome”). some infants may be born with congenital rubella virus infection but have no apparent signs or symptoms of crs and are referred to as “infants with congenital rubella infection only.”</code> | <code>figure 230e-2 countries using rubella vaccine in their national immunization schedule, 2012. (from the world health organization.) is probably lifelong. the most commonly used vaccine globally is the ra27/3 virus strain. the current recommendation for routine rubella vaccination in the united states is a first dose of mmr vaccine at 12–15 months of age and a second dose at 4–6 years. target groups for rubella vaccine include children ≥1 year of age, adolescents and adults without documented evidence of immunity, individuals in congregate settings (e.g., college students, military personnel, child care and health care workers), and susceptible women before and after pregnancy.</code> | <code>the neuropathology is of considerable interest. in the nervous system of fetuses exposed to maternal rubella in the first trimester, r.d. adams found no visible lesions by light microscopy, even though the virus had been isolated from the brain by enders (personal communications). at this period of development there is no inflammatory reaction because of the absence of polymorphonuclear leukocytes, lymphocytes, and other mononuclear cells in the fetus. at birth the brain is usually of normal size, and there may be no discernible lesions. there may be a mild meningeal infiltration of lymphocytes, and a few zones of necrosis and vasculitis with later calcification of vessels are seen, as are small hemorrhages, presumably related to the thrombocytopenia. smallness of the brain and delay in myelination have been observed in children who died at 1 to 2 years of age. none of the brains in adams’ series was malformed. rubella virus continues to be recoverable from the csf for at least 18</code> | <code>rubella (german measles) also spreads from the hairline downward; unlike that of measles, however, the rash of rubella tends to clear from originally affected areas as it migrates, and it may be pruritic (chap. 230e). forchheimer spots (palatal petechiae) may develop but are nonspecific because they also develop in infectious mononucleosis (chap. 218) and scarlet fever (chap. 173). postauricular and suboccipital adenopathy and arthritis are common among adults with rubella. exposure of pregnant women to ill individuals should be avoided, as rubella causes severe congenital abnormalities. numerous strains of enteroviruses (chap. 228), primarily echoviruses and coxsackieviruses, cause nonspecific syndromes of fever and eruptions that may mimic rubella or measles. patients with infectious mononucleosis caused by epstein-barr virus (chap. 218) or with primary hiv infection (chap. 226) may exhibit pharyngitis, lymphadenopathy, and a nonspecific maculopapular exanthem.</code> | <code>after the isolation of rubella virus in the early 1960s and the occurrence of a devastating pandemic, a vaccine for rubella was developed and licensed in 1969. currently, the majority of rubella-containing vaccines (rcvs) used worldwide are combined measles and rubella (mr) or measles, mumps, and rubella (mmr) formulations. a tetravalent measles, mumps, rubella, and varicella (mmrv) vaccine is available but is not widely used. the public health burden of rubella infection is measured primarily through the resulting crs cases. the 1964–1965 rubella epidemic in the united states encompassed >30,000 infections during pregnancy. crs occurred in ~20,000 infants born alive, including >11,000 infants who were deaf, >3500 infants who were blind, and almost 2000 infants who were mentally retarded. the cost of this epidemic exceeded $1.5 billion. in 1983, the cost per child with crs was estimated at $200,000.</code> |
| <code>structure and function of β barrels in membrane proteins</code> | <code>most multipass membrane proteins in eukaryotic cells and in the bacterial plasma membrane are constructed from transmembrane α helices. the helices figure 10–22 steps in the folding of a multipass transmembrane protein. when a newly synthesized transmembrane α helix is released into the lipid bilayer, it is initially surrounded by lipid molecules. as the protein folds, contacts between the helices displace some of the lipid molecules surrounding the helices. figure 10–23 β barrels formed from different numbers of β strands.</code> | <code>β-barrel proteins are abundant in the outer membranes of bacteria, mitochondria, and chloroplasts. some are pore-forming proteins, which create water-filled channels that allow selected small hydrophilic molecules to cross the membrane. the porins are well-studied examples (example 3 in figure 10–23c). many porin barrels are formed from a 16-strand, antiparallel β sheet rolled up into a cylindrical structure. polar amino acid side chains line the aqueous channel on the inside, while nonpolar side chains project from the outside of the barrel to interact with the hydrophobic core of the lipid bilayer. loops of the polypeptide chain often protrude into the lumen of the channel, narrowing it so that only certain solutes can pass. some porins are therefore highly selective: maltoporin, for example, preferentially allows maltose and maltose oligomers to cross the outer membrane of e. coli.</code> | <code>figure 3–8 two types of β sheet structures. (a) an antiparallel β sheet (see figure 3–7c). (b) a parallel β sheet. both of these structures are common in proteins.</code> | <code>the cores of many proteins contain extensive regions of β sheet. as shown in figure 3–8, these β sheets can form either from neighboring segments of the polypeptide backbone that run in the same orientation (parallel chains) or from a polypeptide backbone that folds back and forth upon itself, with each section of the chain running in the direction opposite to that of its immediate neighbors (antiparallel chains). both types of β sheet produce a very rigid structure, held together by hydrogen bonds that connect the peptide bonds in neighboring chains (see figure 3–7c).</code> | <code>one of the central subunits of the sam complex is homologous to a bacterial outer membrane protein that helps insert β-barrel proteins into the bacterial outer figure 12–24 integration of porins into the outer mitochondrial and bacterial membranes. (a) after translocation through the tom complex in the outer mitochondrial membrane, β-barrel proteins bind to chaperones in the intermembrane space. the sam complex then inserts the unfolded polypeptide chain into the outer membrane and helps the chain fold. (b) a structurally related bam complex in the outer membrane of gram-negative bacteria catalyzes β-barrel protein insertion and folding (see figure 11–17). membrane from the periplasmic space (the equivalent of the intermembrane space in mitochondria) (figure 12–24b). this conserved pathway for inserting β-barrel proteins further underscores the endosymbiotic origin of mitochondria. transport into the inner mitochondrial membrane and intermembrane space occurs via several routes</code> |
282
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
283
+ ```json
284
+ {
285
+ "scale": 10.0,
286
+ "num_negatives": 4,
287
+ "activation_fn": "torch.nn.modules.activation.Sigmoid",
288
+ "mini_batch_size": 32
289
+ }
290
+ ```
291
+
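The parameters above control how the ranking loss treats each training example: every query is paired with one positive and `num_negatives` sampled negatives, each logit is passed through the configured `Sigmoid` activation and multiplied by `scale`, and the result is scored with softmax cross-entropy against the positive. As a rough, dependency-free sketch of that scoring step (illustrative only; the actual, cached implementation lives in `sentence_transformers.cross_encoder.losses`):

```python
import math

def mnrl_loss(logits, scale=10.0):
    """Simplified multiple-negatives ranking loss for ONE query.

    logits[0] is the cross-encoder output for the positive passage;
    logits[1:] are the outputs for the sampled negatives. Each logit is
    squashed with a sigmoid (the configured activation_fn), scaled, and
    the loss is cross-entropy with the positive at index 0.
    """
    scores = [scale * (1.0 / (1.0 + math.exp(-z))) for z in logits]
    log_denom = math.log(sum(math.exp(s) for s in scores))
    # Negative log of the softmax probability assigned to the positive.
    return log_denom - scores[0]

# One positive scored well above four negatives -> loss near zero.
low = mnrl_loss([4.0, -2.0, -1.5, -3.0, -2.5])
# Positive indistinguishable from the negatives -> loss of ln(5).
high = mnrl_loss([0.0, 0.0, 0.0, 0.0, 0.0])
```

Because the sigmoid saturates, `scale` sets the effective temperature of the softmax: with `scale=10.0`, scores live in (0, 10), so a confidently separated positive still yields a sharply peaked distribution.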
292
+ ### Training Hyperparameters
293
+ #### Non-Default Hyperparameters
294
+
295
+ - `eval_strategy`: steps
296
+ - `per_device_train_batch_size`: 4
297
+ - `per_device_eval_batch_size`: 4
298
+ - `learning_rate`: 2e-05
299
+ - `num_train_epochs`: 1
300
+ - `warmup_ratio`: 0.1
301
+ - `bf16`: True
302
+ - `dataloader_num_workers`: 4
303
+ - `load_best_model_at_end`: True
304
+
305
+ #### All Hyperparameters
306
+ <details><summary>Click to expand</summary>
307
+
308
+ - `overwrite_output_dir`: False
309
+ - `do_predict`: False
310
+ - `eval_strategy`: steps
311
+ - `prediction_loss_only`: True
312
+ - `per_device_train_batch_size`: 4
313
+ - `per_device_eval_batch_size`: 4
314
+ - `per_gpu_train_batch_size`: None
315
+ - `per_gpu_eval_batch_size`: None
316
+ - `gradient_accumulation_steps`: 1
317
+ - `eval_accumulation_steps`: None
318
+ - `torch_empty_cache_steps`: None
319
+ - `learning_rate`: 2e-05
320
+ - `weight_decay`: 0.0
321
+ - `adam_beta1`: 0.9
322
+ - `adam_beta2`: 0.999
323
+ - `adam_epsilon`: 1e-08
324
+ - `max_grad_norm`: 1.0
325
+ - `num_train_epochs`: 1
326
+ - `max_steps`: -1
327
+ - `lr_scheduler_type`: linear
328
+ - `lr_scheduler_kwargs`: {}
329
+ - `warmup_ratio`: 0.1
330
+ - `warmup_steps`: 0
331
+ - `log_level`: passive
332
+ - `log_level_replica`: warning
333
+ - `log_on_each_node`: True
334
+ - `logging_nan_inf_filter`: True
335
+ - `save_safetensors`: True
336
+ - `save_on_each_node`: False
337
+ - `save_only_model`: False
338
+ - `restore_callback_states_from_checkpoint`: False
339
+ - `no_cuda`: False
340
+ - `use_cpu`: False
341
+ - `use_mps_device`: False
342
+ - `seed`: 42
343
+ - `data_seed`: None
344
+ - `jit_mode_eval`: False
345
+ - `use_ipex`: False
346
+ - `bf16`: True
347
+ - `fp16`: False
348
+ - `fp16_opt_level`: O1
349
+ - `half_precision_backend`: auto
350
+ - `bf16_full_eval`: False
351
+ - `fp16_full_eval`: False
352
+ - `tf32`: None
353
+ - `local_rank`: 0
354
+ - `ddp_backend`: None
355
+ - `tpu_num_cores`: None
356
+ - `tpu_metrics_debug`: False
357
+ - `debug`: []
358
+ - `dataloader_drop_last`: False
359
+ - `dataloader_num_workers`: 4
360
+ - `dataloader_prefetch_factor`: None
361
+ - `past_index`: -1
362
+ - `disable_tqdm`: False
363
+ - `remove_unused_columns`: True
364
+ - `label_names`: None
365
+ - `load_best_model_at_end`: True
366
+ - `ignore_data_skip`: False
367
+ - `fsdp`: []
368
+ - `fsdp_min_num_params`: 0
369
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
370
+ - `tp_size`: 0
371
+ - `fsdp_transformer_layer_cls_to_wrap`: None
372
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
373
+ - `deepspeed`: None
374
+ - `label_smoothing_factor`: 0.0
375
+ - `optim`: adamw_torch
376
+ - `optim_args`: None
377
+ - `adafactor`: False
378
+ - `group_by_length`: False
379
+ - `length_column_name`: length
380
+ - `ddp_find_unused_parameters`: None
381
+ - `ddp_bucket_cap_mb`: None
382
+ - `ddp_broadcast_buffers`: False
383
+ - `dataloader_pin_memory`: True
384
+ - `dataloader_persistent_workers`: False
385
+ - `skip_memory_metrics`: True
386
+ - `use_legacy_prediction_loop`: False
387
+ - `push_to_hub`: False
388
+ - `resume_from_checkpoint`: None
389
+ - `hub_model_id`: None
390
+ - `hub_strategy`: every_save
391
+ - `hub_private_repo`: None
392
+ - `hub_always_push`: False
393
+ - `gradient_checkpointing`: False
394
+ - `gradient_checkpointing_kwargs`: None
395
+ - `include_inputs_for_metrics`: False
396
+ - `include_for_metrics`: []
397
+ - `eval_do_concat_batches`: True
398
+ - `fp16_backend`: auto
399
+ - `push_to_hub_model_id`: None
400
+ - `push_to_hub_organization`: None
401
+ - `mp_parameters`:
402
+ - `auto_find_batch_size`: False
403
+ - `full_determinism`: False
404
+ - `torchdynamo`: None
405
+ - `ray_scope`: last
406
+ - `ddp_timeout`: 1800
407
+ - `torch_compile`: False
408
+ - `torch_compile_backend`: None
409
+ - `torch_compile_mode`: None
410
+ - `include_tokens_per_second`: False
411
+ - `include_num_input_tokens_seen`: False
412
+ - `neftune_noise_alpha`: None
413
+ - `optim_target_modules`: None
414
+ - `batch_eval_metrics`: False
415
+ - `eval_on_start`: False
416
+ - `use_liger_kernel`: False
417
+ - `eval_use_gather_object`: False
418
+ - `average_tokens_across_devices`: False
419
+ - `prompts`: None
420
+ - `batch_sampler`: batch_sampler
421
+ - `multi_dataset_batch_sampler`: proportional
422
+
423
+ </details>
424
+
425
+ ### Training Logs
426
+ | Epoch | Step | Training Loss | dev_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoSCIDOCS_R100_ndcg@10 | NanoSciFact_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
427
+ |:------:|:-----:|:-------------:|:-----------:|:-------------------:|:------------------------:|:------------------------:|:--------------------------:|
428
+ | 0.0000 | 1 | 4.5922 | - | - | - | - | - |
429
+ | 0.0024 | 200 | 3.3404 | - | - | - | - | - |
430
+ | 0.0048 | 400 | 2.1271 | - | - | - | - | - |
431
+ | 0.0073 | 600 | 1.4865 | - | - | - | - | - |
432
+ | 0.0097 | 800 | 0.9195 | - | - | - | - | - |
433
+ | 0.0121 | 1000 | 0.5765 | - | - | - | - | - |
434
+ | 0.0145 | 1200 | 0.4458 | - | - | - | - | - |
435
+ | 0.0170 | 1400 | 0.3502 | - | - | - | - | - |
436
+ | 0.0194 | 1600 | 0.3753 | - | - | - | - | - |
437
+ | 0.0218 | 1800 | 0.3748 | - | - | - | - | - |
438
+ | 0.0242 | 2000 | 0.3334 | - | - | - | - | - |
439
+ | 0.0267 | 2200 | 0.3678 | - | - | - | - | - |
440
+ | 0.0291 | 2400 | 0.3326 | - | - | - | - | - |
441
+ | 0.0315 | 2600 | 0.2861 | - | - | - | - | - |
442
+ | 0.0339 | 2800 | 0.3241 | - | - | - | - | - |
443
+ | 0.0363 | 3000 | 0.2778 | - | - | - | - | - |
444
+ | 0.0388 | 3200 | 0.2823 | - | - | - | - | - |
445
+ | 0.0412 | 3400 | 0.292 | - | - | - | - | - |
446
+ | 0.0436 | 3600 | 0.2853 | - | - | - | - | - |
447
+ | 0.0460 | 3800 | 0.2239 | - | - | - | - | - |
448
+ | 0.0485 | 4000 | 0.242 | - | - | - | - | - |
449
+ | 0.0509 | 4200 | 0.2607 | - | - | - | - | - |
450
+ | 0.0533 | 4400 | 0.2567 | - | - | - | - | - |
451
+ | 0.0557 | 4600 | 0.2382 | - | - | - | - | - |
452
+ | 0.0582 | 4800 | 0.1988 | - | - | - | - | - |
453
+ | 0.0606 | 5000 | 0.2184 | - | - | - | - | - |
454
+ | 0.0630 | 5200 | 0.1865 | - | - | - | - | - |
455
+ | 0.0654 | 5400 | 0.2099 | - | - | - | - | - |
456
+ | 0.0678 | 5600 | 0.2375 | - | - | - | - | - |
457
+ | 0.0703 | 5800 | 0.2399 | - | - | - | - | - |
458
+ | 0.0727 | 6000 | 0.2486 | - | - | - | - | - |
459
+ | 0.0751 | 6200 | 0.2419 | - | - | - | - | - |
460
+ | 0.0775 | 6400 | 0.1771 | - | - | - | - | - |
461
+ | 0.0800 | 6600 | 0.2185 | - | - | - | - | - |
462
+ | 0.0824 | 6800 | 0.2261 | - | - | - | - | - |
463
+ | 0.0848 | 7000 | 0.2615 | - | - | - | - | - |
464
+ | 0.0872 | 7200 | 0.2662 | - | - | - | - | - |
465
+ | 0.0897 | 7400 | 0.2042 | - | - | - | - | - |
466
+ | 0.0921 | 7600 | 0.2712 | - | - | - | - | - |
467
+ | 0.0945 | 7800 | 0.3638 | - | - | - | - | - |
468
+ | 0.0969 | 8000 | 0.2343 | - | - | - | - | - |
469
+ | 0.0993 | 8200 | 0.3492 | - | - | - | - | - |
470
+ | 0.1018 | 8400 | 0.319 | - | - | - | - | - |
471
+ | 0.1042 | 8600 | 0.3326 | - | - | - | - | - |
472
+ | 0.1066 | 8800 | 0.3436 | - | - | - | - | - |
473
+ | 0.1090 | 9000 | 0.3442 | - | - | - | - | - |
474
+ | 0.1115 | 9200 | 0.2505 | - | - | - | - | - |
475
+ | 0.1139 | 9400 | 0.3844 | - | - | - | - | - |
476
+ | 0.1163 | 9600 | 0.4207 | - | - | - | - | - |
477
+ | 0.1187 | 9800 | 0.3018 | - | - | - | - | - |
478
+ | 0.1212 | 10000 | 0.3979 | 0.9704 | 0.3307 (-0.1699) | 0.2968 (-0.0384) | 0.7085 (-0.0014) | 0.4453 (-0.0699) |
479
+
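The `ndcg@10` columns report normalized discounted cumulative gain over the top 10 reranked passages, which is how the headline `dev_ndcg@10` of 0.9704 is measured. A small self-contained sketch of the standard NDCG@10 definition (not the exact evaluator code used above):

```python
import math

def ndcg_at_10(ranked_relevances):
    """NDCG@10 for one query.

    ranked_relevances: relevance labels (e.g. 1 = relevant, 0 = not)
    in the order the reranker returned the passages.
    """
    def dcg(rels):
        # Gains are discounted logarithmically by rank position.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:10]))
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Relevant passage ranked first -> perfect score of 1.0.
perfect = ndcg_at_10([1, 0, 0, 0, 0])
# Relevant passage ranked third -> discounted to 1/log2(5) of... rank 3
# contributes 1/log2(3 + 1) = 0.5 against an ideal DCG of 1.0.
discounted = ndcg_at_10([0, 0, 1, 0, 0])
```

The values in parentheses in the NanoBEIR columns are deltas against the base (pre-finetuning) reranking scores logged in `trainer_state.json`.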
480
+
481
+ ### Framework Versions
482
+ - Python: 3.10.12
483
+ - Sentence Transformers: 4.1.0
484
+ - Transformers: 4.51.3
485
+ - PyTorch: 2.7.0+cu126
486
+ - Accelerate: 1.6.0
487
+ - Datasets: 3.5.1
488
+ - Tokenizers: 0.21.1
489
+
490
+ ## Citation
491
+
492
+ ### BibTeX
493
+
494
+ #### Sentence Transformers
495
+ ```bibtex
496
+ @inproceedings{reimers-2019-sentence-bert,
497
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
498
+ author = "Reimers, Nils and Gurevych, Iryna",
499
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
500
+ month = "11",
501
+ year = "2019",
502
+ publisher = "Association for Computational Linguistics",
503
+ url = "https://arxiv.org/abs/1908.10084",
504
+ }
505
+ ```
506
+
507
+ <!--
508
+ ## Glossary
509
+
510
+ *Clearly define terms in order to be accessible across audiences.*
511
+ -->
512
+
513
+ <!--
514
+ ## Model Card Authors
515
+
516
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
517
+ -->
518
+
519
+ <!--
520
+ ## Model Card Contact
521
+
522
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
523
+ -->
crossencoder-checkpoints/checkpoint-googlebert-10000/config.json ADDED
@@ -0,0 +1,35 @@
1
+ {
2
+ "architectures": [
3
+ "BertForSequenceClassification"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "classifier_dropout": null,
7
+ "gradient_checkpointing": false,
8
+ "hidden_act": "gelu",
9
+ "hidden_dropout_prob": 0.1,
10
+ "hidden_size": 768,
11
+ "id2label": {
12
+ "0": "LABEL_0"
13
+ },
14
+ "initializer_range": 0.02,
15
+ "intermediate_size": 3072,
16
+ "label2id": {
17
+ "LABEL_0": 0
18
+ },
19
+ "layer_norm_eps": 1e-12,
20
+ "max_position_embeddings": 512,
21
+ "model_type": "bert",
22
+ "num_attention_heads": 12,
23
+ "num_hidden_layers": 12,
24
+ "pad_token_id": 0,
25
+ "position_embedding_type": "absolute",
26
+ "sentence_transformers": {
27
+ "activation_fn": "torch.nn.modules.activation.Sigmoid",
28
+ "version": "4.1.0"
29
+ },
30
+ "torch_dtype": "float32",
31
+ "transformers_version": "4.51.3",
32
+ "type_vocab_size": 2,
33
+ "use_cache": true,
34
+ "vocab_size": 28996
35
+ }
crossencoder-checkpoints/checkpoint-googlebert-10000/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e36dfe2d84bbe93c64fb4878c518e63a45145d922196f39763b5d0ffa429e2ed
3
+ size 433267692
crossencoder-checkpoints/checkpoint-googlebert-10000/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a5b1aabb1a01516a6049168abb5af7255e52a32b8d35e55b49e7164429ea4295
3
+ size 14645
crossencoder-checkpoints/checkpoint-googlebert-10000/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f4cfb34c5e798451a2249344555252a99e1e9f9e355f0d6a6f67cfa88d46b8e1
3
+ size 1465
crossencoder-checkpoints/checkpoint-googlebert-10000/special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
1
+ {
2
+ "cls_token": "[CLS]",
3
+ "mask_token": "[MASK]",
4
+ "pad_token": "[PAD]",
5
+ "sep_token": "[SEP]",
6
+ "unk_token": "[UNK]"
7
+ }
crossencoder-checkpoints/checkpoint-googlebert-10000/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
crossencoder-checkpoints/checkpoint-googlebert-10000/tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": false,
45
+ "cls_token": "[CLS]",
46
+ "do_lower_case": false,
47
+ "extra_special_tokens": {},
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "pad_token": "[PAD]",
51
+ "padding": true,
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "truncation": true,
57
+ "unk_token": "[UNK]"
58
+ }
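Per this tokenizer configuration, each query–passage pair is encoded as a single cased-BERT sequence using the special tokens declared above, truncated to `model_max_length` of 512. A hedged sketch of that layout (the real work, including WordPiece splitting and pairwise truncation, is done by `BertTokenizer`):

```python
def pair_layout(query_tokens, passage_tokens, max_len=512):
    """Illustrative [CLS] query [SEP] passage [SEP] layout of a BERT
    cross-encoder input; WordPiece tokenization is omitted, and the
    truncation here is a simplification of the tokenizer's strategy."""
    seq = ["[CLS]"] + query_tokens + ["[SEP]"] + passage_tokens + ["[SEP]"]
    return seq[:max_len]  # truncation=True caps the pair at max_len

layout = pair_layout(["rubella", "vaccine"], ["mmr", "formulations"])
```

Because `do_lower_case` is false, casing is preserved, matching the `google-bert/bert-base-cased` vocabulary of 28,996 WordPiece entries.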
crossencoder-checkpoints/checkpoint-googlebert-10000/trainer_state.json ADDED
@@ -0,0 +1,426 @@
1
+ {
2
+ "best_global_step": 10000,
3
+ "best_metric": 0.9703932950632154,
4
+ "best_model_checkpoint": "models/google-bert/bert-base-cased-cross_encoder_dataset_finetuned_bert_base-run-20250503-141028/checkpoint-10000",
5
+ "epoch": 0.12115631587874676,
6
+ "eval_steps": 10000,
7
+ "global_step": 10000,
8
+ "is_hyper_param_search": false,
9
+ "is_local_process_zero": true,
10
+ "is_world_process_zero": true,
11
+ "log_history": [
12
+ {
13
+ "epoch": 1.2115631587874677e-05,
14
+ "grad_norm": 69.96532440185547,
15
+ "learning_rate": 0.0,
16
+ "loss": 4.5922,
17
+ "step": 1
18
+ },
19
+ {
20
+ "epoch": 0.0024231263175749354,
21
+ "grad_norm": 19.02867889404297,
22
+ "learning_rate": 4.821904531136419e-07,
23
+ "loss": 3.3404,
24
+ "step": 200
25
+ },
26
+ {
27
+ "epoch": 0.004846252635149871,
28
+ "grad_norm": 11.468595504760742,
29
+ "learning_rate": 9.6680397383087e-07,
30
+ "loss": 2.1271,
31
+ "step": 400
32
+ },
33
+ {
34
+ "epoch": 0.007269378952724805,
35
+ "grad_norm": 23.1657772064209,
36
+ "learning_rate": 1.451417494548098e-06,
37
+ "loss": 1.4865,
38
+ "step": 600
39
+ },
40
+ {
41
+ "epoch": 0.009692505270299741,
42
+ "grad_norm": 38.3974494934082,
43
+ "learning_rate": 1.936031015265326e-06,
44
+ "loss": 0.9195,
45
+ "step": 800
46
+ },
47
+ {
48
+ "epoch": 0.012115631587874676,
49
+ "grad_norm": 33.8629264831543,
50
+ "learning_rate": 2.420644535982554e-06,
51
+ "loss": 0.5765,
52
+ "step": 1000
53
+ },
54
+ {
55
+ "epoch": 0.01453875790544961,
56
+ "grad_norm": 77.52400207519531,
57
+ "learning_rate": 2.9052580566997825e-06,
58
+ "loss": 0.4458,
59
+ "step": 1200
60
+ },
61
+ {
62
+ "epoch": 0.016961884223024547,
63
+ "grad_norm": 44.24454116821289,
64
+ "learning_rate": 3.3898715774170105e-06,
65
+ "loss": 0.3502,
66
+ "step": 1400
67
+ },
68
+ {
69
+ "epoch": 0.019385010540599483,
70
+ "grad_norm": 0.05398047715425491,
71
+ "learning_rate": 3.874485098134238e-06,
72
+ "loss": 0.3753,
73
+ "step": 1600
74
+ },
75
+ {
76
+ "epoch": 0.021808136858174416,
77
+ "grad_norm": 60.81565856933594,
78
+ "learning_rate": 4.3590986188514665e-06,
79
+ "loss": 0.3748,
80
+ "step": 1800
81
+ },
82
+ {
83
+ "epoch": 0.024231263175749352,
84
+ "grad_norm": 58.663352966308594,
85
+ "learning_rate": 4.8437121395686945e-06,
86
+ "loss": 0.3334,
87
+ "step": 2000
88
+ },
89
+ {
90
+ "epoch": 0.026654389493324288,
91
+ "grad_norm": 252.8543243408203,
92
+ "learning_rate": 5.3283256602859225e-06,
93
+ "loss": 0.3678,
94
+ "step": 2200
95
+ },
96
+ {
97
+ "epoch": 0.02907751581089922,
98
+ "grad_norm": 75.89717102050781,
99
+ "learning_rate": 5.81293918100315e-06,
100
+ "loss": 0.3326,
101
+ "step": 2400
102
+ },
103
+ {
104
+ "epoch": 0.03150064212847416,
105
+ "grad_norm": 38.77901840209961,
106
+ "learning_rate": 6.2975527017203786e-06,
107
+ "loss": 0.2861,
108
+ "step": 2600
109
+ },
110
+ {
111
+ "epoch": 0.033923768446049093,
112
+ "grad_norm": 43.75902557373047,
113
+ "learning_rate": 6.782166222437606e-06,
114
+ "loss": 0.3241,
115
+ "step": 2800
116
+ },
117
+ {
118
+ "epoch": 0.03634689476362403,
119
+ "grad_norm": 9.858089447021484,
120
+ "learning_rate": 7.266779743154835e-06,
121
+ "loss": 0.2778,
122
+ "step": 3000
123
+ },
124
+ {
125
+ "epoch": 0.038770021081198966,
126
+ "grad_norm": 1.7009317874908447,
127
+ "learning_rate": 7.751393263872062e-06,
128
+ "loss": 0.2823,
129
+ "step": 3200
130
+ },
131
+ {
132
+ "epoch": 0.041193147398773895,
133
+ "grad_norm": 1.1402368545532227,
134
+ "learning_rate": 8.23600678458929e-06,
135
+ "loss": 0.292,
136
+ "step": 3400
137
+ },
138
+ {
139
+ "epoch": 0.04361627371634883,
140
+ "grad_norm": 35.95049285888672,
141
+ "learning_rate": 8.720620305306518e-06,
142
+ "loss": 0.2853,
143
+ "step": 3600
144
+ },
145
+ {
146
+ "epoch": 0.04603940003392377,
147
+ "grad_norm": 7.911437034606934,
148
+ "learning_rate": 9.205233826023747e-06,
149
+ "loss": 0.2239,
150
+ "step": 3800
151
+ },
152
+ {
153
+ "epoch": 0.048462526351498704,
154
+ "grad_norm": 20.524240493774414,
155
+ "learning_rate": 9.689847346740975e-06,
156
+ "loss": 0.242,
157
+ "step": 4000
158
+ },
159
+ {
160
+ "epoch": 0.05088565266907364,
161
+ "grad_norm": 29.201196670532227,
162
+ "learning_rate": 1.0174460867458203e-05,
163
+ "loss": 0.2607,
164
+ "step": 4200
165
+ },
166
+ {
167
+ "epoch": 0.053308778986648576,
168
+ "grad_norm": 8.403849601745605,
169
+ "learning_rate": 1.065907438817543e-05,
170
+ "loss": 0.2567,
171
+ "step": 4400
172
+ },
173
+ {
174
+ "epoch": 0.055731905304223506,
175
+ "grad_norm": 0.07258583605289459,
176
+ "learning_rate": 1.114368790889266e-05,
177
+ "loss": 0.2382,
178
+ "step": 4600
179
+ },
180
+ {
181
+ "epoch": 0.05815503162179844,
182
+ "grad_norm": 8.787927627563477,
183
+ "learning_rate": 1.1628301429609888e-05,
184
+ "loss": 0.1988,
185
+ "step": 4800
186
+ },
187
+ {
188
+ "epoch": 0.06057815793937338,
189
+ "grad_norm": 38.405887603759766,
190
+ "learning_rate": 1.2112914950327115e-05,
191
+ "loss": 0.2184,
192
+ "step": 5000
193
+ },
194
+ {
195
+ "epoch": 0.06300128425694831,
196
+ "grad_norm": 12.370210647583008,
197
+ "learning_rate": 1.2597528471044342e-05,
198
+ "loss": 0.1865,
199
+ "step": 5200
200
+ },
201
+ {
202
+ "epoch": 0.06542441057452325,
203
+ "grad_norm": 0.00251359143294394,
204
+ "learning_rate": 1.3082141991761572e-05,
205
+ "loss": 0.2099,
206
+ "step": 5400
207
+ },
208
+ {
209
+ "epoch": 0.06784753689209819,
210
+ "grad_norm": 215.8644256591797,
211
+ "learning_rate": 1.35667555124788e-05,
212
+ "loss": 0.2375,
213
+ "step": 5600
214
+ },
215
+ {
216
+ "epoch": 0.07027066320967312,
217
+ "grad_norm": 7.582530975341797,
218
+ "learning_rate": 1.4051369033196027e-05,
219
+ "loss": 0.2399,
220
+ "step": 5800
221
+ },
222
+ {
223
+ "epoch": 0.07269378952724806,
224
+ "grad_norm": 110.00708770751953,
225
+ "learning_rate": 1.4535982553913256e-05,
226
+ "loss": 0.2486,
227
+ "step": 6000
228
+ },
229
+ {
230
+ "epoch": 0.075116915844823,
231
+ "grad_norm": 9.572257041931152,
232
+ "learning_rate": 1.5020596074630483e-05,
233
+ "loss": 0.2419,
234
+ "step": 6200
235
+ },
236
+ {
237
+ "epoch": 0.07754004216239793,
238
+ "grad_norm": 42.000762939453125,
239
+ "learning_rate": 1.550520959534771e-05,
240
+ "loss": 0.1771,
241
+ "step": 6400
242
+ },
243
+ {
244
+ "epoch": 0.07996316847997285,
245
+ "grad_norm": 55.88412094116211,
246
+ "learning_rate": 1.598982311606494e-05,
247
+ "loss": 0.2185,
248
+ "step": 6600
249
+ },
250
+ {
251
+ "epoch": 0.08238629479754779,
252
+ "grad_norm": 0.12212779372930527,
253
+ "learning_rate": 1.6474436636782166e-05,
254
+ "loss": 0.2261,
255
+ "step": 6800
256
+ },
257
+ {
258
+ "epoch": 0.08480942111512273,
259
+ "grad_norm": 0.22978061437606812,
260
+ "learning_rate": 1.6959050157499395e-05,
261
+ "loss": 0.2615,
262
+ "step": 7000
263
+ },
264
+ {
265
+ "epoch": 0.08723254743269766,
266
+ "grad_norm": 13.786860466003418,
267
+ "learning_rate": 1.7443663678216624e-05,
268
+ "loss": 0.2662,
269
+ "step": 7200
270
+ },
271
+ {
272
+ "epoch": 0.0896556737502726,
273
+ "grad_norm": 1.807544231414795,
274
+ "learning_rate": 1.7928277198933853e-05,
275
+ "loss": 0.2042,
276
+ "step": 7400
277
+ },
278
+ {
279
+ "epoch": 0.09207880006784754,
280
+ "grad_norm": 200.43603515625,
281
+ "learning_rate": 1.841289071965108e-05,
282
+ "loss": 0.2712,
283
+ "step": 7600
284
+ },
285
+ {
286
+ "epoch": 0.09450192638542247,
287
+ "grad_norm": 6.414185047149658,
288
+ "learning_rate": 1.8897504240368307e-05,
289
+ "loss": 0.3638,
290
+ "step": 7800
291
+ },
292
+ {
293
+ "epoch": 0.09692505270299741,
294
+ "grad_norm": 19.78321075439453,
295
+ "learning_rate": 1.9382117761085536e-05,
296
+ "loss": 0.2343,
297
+ "step": 8000
298
+ },
299
+ {
300
+ "epoch": 0.09934817902057234,
301
+ "grad_norm": 0.0026795840822160244,
302
+ "learning_rate": 1.9866731281802765e-05,
303
+ "loss": 0.3492,
304
+ "step": 8200
305
+ },
306
+ {
307
+ "epoch": 0.10177130533814728,
308
+ "grad_norm": 0.12078650295734406,
309
+ "learning_rate": 1.9960960637553176e-05,
310
+ "loss": 0.319,
311
+ "step": 8400
312
+ },
313
+ {
314
+ "epoch": 0.10419443165572222,
315
+ "grad_norm": 0.06971794366836548,
316
+ "learning_rate": 1.9907113241074796e-05,
317
+ "loss": 0.3326,
318
+ "step": 8600
319
+ },
320
+ {
321
+ "epoch": 0.10661755797329715,
322
+ "grad_norm": 4.194886207580566,
323
+ "learning_rate": 1.9853265844596416e-05,
324
+ "loss": 0.3436,
325
+ "step": 8800
326
+ },
327
+ {
328
+ "epoch": 0.10904068429087209,
329
+ "grad_norm": 0.14339782297611237,
330
+ "learning_rate": 1.9799418448118035e-05,
331
+ "loss": 0.3442,
332
+ "step": 9000
333
+ },
334
+ {
335
+ "epoch": 0.11146381060844701,
336
+ "grad_norm": 0.16107772290706635,
337
+ "learning_rate": 1.9745571051639655e-05,
338
+ "loss": 0.2505,
339
+ "step": 9200
340
+ },
341
+ {
342
+ "epoch": 0.11388693692602195,
343
+ "grad_norm": 0.0043859235011041164,
344
+ "learning_rate": 1.9691723655161275e-05,
345
+ "loss": 0.3844,
346
+ "step": 9400
347
+ },
348
+ {
349
+ "epoch": 0.11631006324359688,
350
+ "grad_norm": 17.954904556274414,
351
+ "learning_rate": 1.9637876258682895e-05,
352
+ "loss": 0.4207,
353
+ "step": 9600
354
+ },
355
+ {
356
+ "epoch": 0.11873318956117182,
357
+ "grad_norm": 0.1238822266459465,
358
+ "learning_rate": 1.9584028862204514e-05,
359
+ "loss": 0.3018,
360
+ "step": 9800
361
+ },
362
+ {
363
+ "epoch": 0.12115631587874676,
364
+ "grad_norm": 0.001291484571993351,
365
+ "learning_rate": 1.9530181465726134e-05,
366
+ "loss": 0.3979,
367
+ "step": 10000
368
+ },
369
+ {
370
+ "epoch": 0.12115631587874676,
371
+ "eval_NanoBEIR_R100_mean_base_map": 0.45455600104354227,
372
+ "eval_NanoBEIR_R100_mean_base_mrr@10": 0.5547539682539683,
373
+ "eval_NanoBEIR_R100_mean_base_ndcg@10": 0.5152256348613615,
374
+ "eval_NanoBEIR_R100_mean_map": 0.3952857369390113,
375
+ "eval_NanoBEIR_R100_mean_mrr@10": 0.503989417989418,
376
+ "eval_NanoBEIR_R100_mean_ndcg@10": 0.4453369270017804,
377
+ "eval_NanoNQ_R100_base_map": 0.4196061957396544,
378
+ "eval_NanoNQ_R100_base_mrr@10": 0.4266904761904762,
379
+ "eval_NanoNQ_R100_base_ndcg@10": 0.5006467934630127,
380
+ "eval_NanoNQ_R100_map": 0.26760554447403784,
381
+ "eval_NanoNQ_R100_mrr@10": 0.30400000000000005,
382
+ "eval_NanoNQ_R100_ndcg@10": 0.330741865534241,
383
+ "eval_NanoSCIDOCS_R100_base_map": 0.27430707601124094,
384
+ "eval_NanoSCIDOCS_R100_base_mrr@10": 0.5595238095238095,
385
+ "eval_NanoSCIDOCS_R100_base_ndcg@10": 0.33512313493909396,
386
+ "eval_NanoSCIDOCS_R100_map": 0.24249363273783214,
387
+ "eval_NanoSCIDOCS_R100_mrr@10": 0.527079365079365,
388
+ "eval_NanoSCIDOCS_R100_ndcg@10": 0.2967566724802968,
389
+ "eval_NanoSciFact_R100_base_map": 0.6697547313797314,
390
+ "eval_NanoSciFact_R100_base_mrr@10": 0.678047619047619,
391
+ "eval_NanoSciFact_R100_base_ndcg@10": 0.709906976181978,
392
+ "eval_NanoSciFact_R100_map": 0.6757580336051641,
393
+ "eval_NanoSciFact_R100_mrr@10": 0.6808888888888889,
394
+ "eval_NanoSciFact_R100_ndcg@10": 0.7085122429908034,
395
+ "eval_dev_map": 0.9441464035183368,
396
+ "eval_dev_mrr@10": 0.9441464035183368,
397
+ "eval_dev_ndcg@10": 0.9703932950632154,
398
+ "eval_runtime": 2239.1601,
399
+ "eval_samples_per_second": 0.0,
400
+ "eval_sequential_score": 0.4453369270017804,
401
+ "eval_steps_per_second": 0.0,
402
+ "step": 10000
403
+ }
404
+ ],
405
+ "logging_steps": 200,
406
+ "max_steps": 82538,
407
+ "num_input_tokens_seen": 0,
408
+ "num_train_epochs": 1,
409
+ "save_steps": 10000,
410
+ "stateful_callbacks": {
411
+ "TrainerControl": {
412
+ "args": {
413
+ "should_epoch_stop": false,
414
+ "should_evaluate": false,
415
+ "should_log": false,
416
+ "should_save": true,
417
+ "should_training_stop": false
418
+ },
419
+ "attributes": {}
420
+ }
421
+ },
422
+ "total_flos": 0.0,
423
+ "train_batch_size": 4,
424
+ "trial_name": null,
425
+ "trial_params": null
426
+ }
crossencoder-checkpoints/checkpoint-googlebert-10000/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:40289707f5598971ba9c6d98fe4f8c454049550e6eafc2a0a7fa3a69af3ae017
3
+ size 6225
crossencoder-checkpoints/checkpoint-googlebert-10000/vocab.txt ADDED
The diff for this file is too large to render. See raw diff