Upload folder using huggingface_hub
README.md
CHANGED
@@ -1,11 +1,13 @@
|
|
1 |
---
|
|
|
|
|
2 |
tags:
|
3 |
- sentence-transformers
|
4 |
- sentence-similarity
|
5 |
- feature-extraction
|
6 |
- generated_from_trainer
|
7 |
- dataset_size:2839738
|
8 |
-
- loss:
|
9 |
base_model: Mihaiii/gte-micro-v4
|
10 |
widget:
|
11 |
- source_sentence: 314d5e89-55f7-42b4-af19-d4d0f499a265_c808a8ec-895c-4777-9e11-e83ce34eddef
|
@@ -138,6 +140,8 @@ widget:
|
|
138 |
|
139 |
Cycling {2} ({2}, Discard this card: Draw a card.)'
|
140 |
- https://cards.scryfall.io/normal/front/0/3/0367fac8-6990-4544-ac7d-ed363b55a9cf.jpg?1562700664
|
141 |
pipeline_tag: sentence-similarity
|
142 |
library_name: sentence-transformers
|
143 |
metrics:
|
@@ -154,16 +158,10 @@ model-index:
|
|
154 |
type: sts-dev
|
155 |
metrics:
|
156 |
- type: pearson_cosine
|
157 |
-
value: 0.
|
158 |
name: Pearson Cosine
|
159 |
- type: spearman_cosine
|
160 |
-
value: 0.
|
161 |
-
name: Spearman Cosine
|
162 |
-
- type: pearson_cosine
|
163 |
-
value: 0.43782959181274894
|
164 |
-
name: Pearson Cosine
|
165 |
-
- type: spearman_cosine
|
166 |
-
value: 0.4808140058026093
|
167 |
name: Spearman Cosine
|
168 |
- task:
|
169 |
type: semantic-similarity
|
@@ -173,16 +171,16 @@ model-index:
|
|
173 |
type: sts-test
|
174 |
metrics:
|
175 |
- type: pearson_cosine
|
176 |
-
value: 0.
|
177 |
name: Pearson Cosine
|
178 |
- type: spearman_cosine
|
179 |
-
value: 0.
|
180 |
name: Spearman Cosine
|
181 |
---
|
182 |
|
183 |
# SentenceTransformer based on Mihaiii/gte-micro-v4
|
184 |
|
185 |
-
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Mihaiii/gte-micro-v4](https://huggingface.co/Mihaiii/gte-micro-v4) on the
|
186 |
|
187 |
## Model Details
|
188 |
|
@@ -193,8 +191,8 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [M
|
|
193 |
- **Output Dimensionality:** 384 dimensions
|
194 |
- **Similarity Function:** Cosine Similarity
|
195 |
- **Training Dataset:**
|
196 |
-
-
|
197 |
-
|
198 |
<!-- - **License:** Unknown -->
|
199 |
|
200 |
### Model Sources
|
@@ -279,18 +277,8 @@ You can finetune this model on your own dataset.
|
|
279 |
|
280 |
| Metric | sts-dev | sts-test |
|
281 |
|:--------------------|:-----------|:-----------|
|
282 |
-
| pearson_cosine | 0.
|
283 |
-
| **spearman_cosine** | **0.
|
284 |
-
|
285 |
-
#### Semantic Similarity
|
286 |
-
|
287 |
-
* Dataset: `sts-dev`
|
288 |
-
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
289 |
-
|
290 |
-
| Metric | Value |
|
291 |
-
|:--------------------|:-----------|
|
292 |
-
| pearson_cosine | 0.4378 |
|
293 |
-
| **spearman_cosine** | **0.4808** |
|
294 |
|
295 |
<!--
|
296 |
## Bias, Risks and Limitations
|
@@ -308,53 +296,51 @@ You can finetune this model on your own dataset.
|
|
308 |
|
309 |
### Training Dataset
|
310 |
|
311 |
-
####
|
312 |
|
313 |
-
* Dataset:
|
314 |
* Size: 2,839,738 training samples
|
315 |
* Columns: <code>uuid</code>, <code>sentence_1</code>, <code>sentence_2</code>, <code>image_1</code>, <code>image_2</code>, and <code>score</code>
|
316 |
* Approximate statistics based on the first 1000 samples:
|
317 |
-
| | uuid | sentence_1 | sentence_2 | image_1 | image_2 | score
|
318 |
-
|
319 |
-
| type | string | string | string | string | string | float
|
320 |
-
| details | <ul><li>min: 49 tokens</li><li>mean: 56.99 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 69.4 tokens</li><li>max: 180 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 68.59 tokens</li><li>max: 166 tokens</li></ul> | <ul><li>min: 53 tokens</li><li>mean: 58.17 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.28 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min:
|
321 |
* Samples:
|
322 |
| uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
323 |
|:---------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------|
|
324 |
| <code>08f9b863-10b7-46d6-badd-97381e6c7c5e_4330efa7-a11b-4776-9fb0-1cae8aed67b1</code> | <code>Title: Blast Zone<br>Type: Land<br>Desc: This land enters with a charge counter on it.<br>{T}: Add {C}.<br>{X}{X}, {T}: Put X charge counters on this land.<br>{3}, {T}, Sacrifice this land: Destroy each nonland permanent with mana value equal to the number of charge counters on this land.</code> | <code>Title: Tom van de Logt Bio (2000)<br>Type: Card<br>Desc: Quarterfinalist Tom van de Logt posted a perfect 6—0 record during the Standard portion of this year's World Championships. The 19-year-old Groesbeek, Holland native was playing a deck that had a big impact on the metagame this year, "Replenish." This deck used cards like Attunement and Frantic Search to put powerful enchantments, such as Parallax Wave and Opalescence, into the graveyard and then used Replenish to put them all back into play at once.</code> | <code>https://cards.scryfall.io/normal/front/0/8/08f9b863-10b7-46d6-badd-97381e6c7c5e.jpg?1674423042</code> | <code>https://cards.scryfall.io/normal/front/4/3/4330efa7-a11b-4776-9fb0-1cae8aed67b1.jpg?1562767017</code> | <code>0.25</code> |
|
325 |
-
| <code>abe9cf1e-d398-41e0-8b11-afe1015e4fd9_40cb67f7-b4e1-423b-8f55-d44ed383e778</code> | <code>Title: Coral Net<br>Cost: {U}<br>Colors: ['U']<br>Type: Enchantment — Aura<br>Desc: Enchant green or white creature<br>Enchanted creature has "At the beginning of your upkeep, sacrifice this creature unless you discard a card."</code> | <code>Title: Silumgar Butcher<br>Cost: {4}{B}<br>Colors: ['B']<br>Type: Creature — Zombie Djinn<br>Desc: Exploit (When this creature enters, you may sacrifice a creature.)<br>When this creature exploits a creature, target creature gets -3/-3 until end of turn.</code> | <code>https://cards.scryfall.io/normal/front/a/b/abe9cf1e-d398-41e0-8b11-afe1015e4fd9.jpg?1562631469</code> | <code>https://cards.scryfall.io/normal/front/4/0/40cb67f7-b4e1-423b-8f55-d44ed383e778.jpg?1562785294</code> | <code
|
326 |
-
| <code>3dd13408-b4db-42e7-bf3c-d46716538a7c_05a6dc90-3997-4911-8bd6-854c85eca35b</code> | <code>Title: Rishadan Brigand<br>Cost: {4}{U}<br>Colors: ['U']<br>Type: Creature — Human Pirate<br>Desc: Flying<br>When this creature enters, each opponent sacrifices a permanent of their choice unless they pay {3}.<br>This creature can block only creatures with flying.</code> | <code>Title: Banishing Stroke<br>Cost: {5}{W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Put target artifact, creature, or enchantment on the bottom of its owner's library.<br>Miracle {W} (You may cast this card for its miracle cost when you draw it if it's the first card you drew this turn.)</code> | <code>https://cards.scryfall.io/normal/front/3/d/3dd13408-b4db-42e7-bf3c-d46716538a7c.jpg?1632145390</code> | <code>https://cards.scryfall.io/normal/front/0/5/05a6dc90-3997-4911-8bd6-854c85eca35b.jpg?1723433851</code> | <code
|
327 |
-
* Loss: [<code>
|
328 |
```json
|
329 |
{
|
330 |
-
"
|
331 |
-
"similarity_fct": "pairwise_cos_sim"
|
332 |
}
|
333 |
```
|
334 |
|
335 |
### Evaluation Dataset
|
336 |
|
337 |
-
####
|
338 |
|
339 |
-
* Dataset:
|
340 |
* Size: 74,730 evaluation samples
|
341 |
* Columns: <code>uuid</code>, <code>sentence_1</code>, <code>sentence_2</code>, <code>image_1</code>, <code>image_2</code>, and <code>score</code>
|
342 |
* Approximate statistics based on the first 1000 samples:
|
343 |
-
| | uuid | sentence_1 | sentence_2 | image_1 | image_2 | score
|
344 |
-
|
345 |
-
| type | string | string | string | string | string | float
|
346 |
-
| details | <ul><li>min: 50 tokens</li><li>mean: 56.9 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 68.44 tokens</li><li>max: 181 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 69.49 tokens</li><li>max: 179 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.22 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.21 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min:
|
347 |
* Samples:
|
348 |
| uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
349 |
|:---------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------|
|
350 |
| <code>6bdd8645-aee9-44cb-acaa-2674f55cdf2f_b34bb149-2e50-462e-8b83-5c8339bb3aff</code> | <code>Title: Syr Cadian, Knight Owl<br>Cost: {3}{W}{W}<br>Colors: ['W']<br>Type: Legendary Creature — Bird Knight<br>Desc: Knightlifelink (Damage dealt by Knights you control also causes you to gain that much life.)<br>{W}: Syr Cadian gains vigilance until end of turn. Activate only from sunrise to sunset.<br>{B}: Syr Cadian gains flying until end of turn. Activate only from sunset to sunrise.</code> | <code>Title: Non-Human Cannonball<br>Cost: {2}{R}<br>Colors: ['R']<br>Type: Artifact Creature — Clown Robot<br>Desc: When this creature dies, roll a six-sided die. If the result is 4 or less, this creature deals that much damage to you.</code> | <code>https://cards.scryfall.io/normal/front/6/b/6bdd8645-aee9-44cb-acaa-2674f55cdf2f.jpg?1664317187</code> | <code>https://cards.scryfall.io/normal/front/b/3/b34bb149-2e50-462e-8b83-5c8339bb3aff.jpg?1673917877</code> | <code>0.25</code> |
|
351 |
| <code>860f4304-38f1-4c2f-a122-2590619522fd_08d6db9b-b2da-4148-aa49-8c2fecac6e32</code> | <code>Title: Hindering Light<br>Cost: {W}{U}<br>Colors: ['U', 'W']<br>Type: Instant<br>Desc: Counter target spell that targets you or a permanent you control.<br>Draw a card.</code> | <code>Title: Gleam of Resistance<br>Cost: {4}{W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Creatures you control get +1/+2 until end of turn. Untap those creatures.<br>Basic landcycling {1}{W} ({1}{W}, Discard this card: Search your library for a basic land card, reveal it, put it into your hand, then shuffle.)</code> | <code>https://cards.scryfall.io/normal/front/8/6/860f4304-38f1-4c2f-a122-2590619522fd.jpg?1712353583</code> | <code>https://cards.scryfall.io/normal/front/0/8/08d6db9b-b2da-4148-aa49-8c2fecac6e32.jpg?1573505575</code> | <code>0.25</code> |
|
352 |
| <code>91b448f4-aa0c-42c7-a771-e8dd20e0520c_46f810c2-310e-42f5-ab1f-d56396cf5124</code> | <code>Title: Practiced Tactics<br>Cost: {W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Choose target attacking or blocking creature. Practiced Tactics deals damage to that creature equal to twice the number of creatures in your party. (Your party consists of up to one each of Cleric, Rogue, Warrior, and Wizard.)</code> | <code>Title: Anointer Priest<br>Cost: {1}{W}<br>Colors: ['W']<br>Type: Creature — Human Cleric<br>Desc: Whenever a creature token you control enters, you gain 1 life.<br>Embalm {3}{W} ({3}{W}, Exile this card from your graveyard: Create a token that's a copy of it, except it's a white Zombie Human Cleric with no mana cost. Embalm only as a sorcery.)</code> | <code>https://cards.scryfall.io/normal/front/9/1/91b448f4-aa0c-42c7-a771-e8dd20e0520c.jpg?1604192922</code> | <code>https://cards.scryfall.io/normal/front/4/6/46f810c2-310e-42f5-ab1f-d56396cf5124.jpg?1599769231</code> | <code>0.25</code> |
|
353 |
-
* Loss: [<code>
|
354 |
```json
|
355 |
{
|
356 |
-
"
|
357 |
-
"similarity_fct": "pairwise_cos_sim"
|
358 |
}
|
359 |
```
|
360 |
|
@@ -370,7 +356,10 @@ You can finetune this model on your own dataset.
|
|
370 |
- `log_level_replica`: passive
|
371 |
- `log_on_each_node`: False
|
372 |
- `logging_nan_inf_filter`: False
|
373 |
-
- `
|
374 |
- `batch_sampler`: no_duplicates
|
375 |
|
376 |
#### All Hyperparameters
|
@@ -415,7 +404,7 @@ You can finetune this model on your own dataset.
|
|
415 |
- `jit_mode_eval`: False
|
416 |
- `use_ipex`: False
|
417 |
- `bf16`: False
|
418 |
-
- `fp16`:
|
419 |
- `fp16_opt_level`: O1
|
420 |
- `half_precision_backend`: auto
|
421 |
- `bf16_full_eval`: False
|
@@ -455,12 +444,12 @@ You can finetune this model on your own dataset.
|
|
455 |
- `dataloader_persistent_workers`: False
|
456 |
- `skip_memory_metrics`: True
|
457 |
- `use_legacy_prediction_loop`: False
|
458 |
-
- `push_to_hub`:
|
459 |
-
- `resume_from_checkpoint`:
|
460 |
-
- `hub_model_id`:
|
461 |
- `hub_strategy`: every_save
|
462 |
- `hub_private_repo`: None
|
463 |
-
- `hub_always_push`:
|
464 |
- `gradient_checkpointing`: False
|
465 |
- `gradient_checkpointing_kwargs`: None
|
466 |
- `include_inputs_for_metrics`: False
|
@@ -496,104 +485,98 @@ You can finetune this model on your own dataset.
|
|
496 |
</details>
|
497 |
|
498 |
### Training Logs
|
499 |
-
| Epoch | Step | Training Loss | sts-dev_spearman_cosine | sts-test_spearman_cosine |
|
500 |
-
|
501 |
-
| -1 | -1 | - | 0.3315 | - |
|
502-590 |
-
| ... | ... | ... | ... | ... |
|
591 |
-
| 0.9466 | 42000 | 6.8095 | - | - |
|
592 |
-
| 0.9578 | 42500 | 6.8042 | - | - |
|
593 |
-
| 0.9691 | 43000 | 6.8086 | - | - |
|
594 |
-
| 0.9804 | 43500 | 6.8106 | - | - |
|
595 |
-
| 0.9916 | 44000 | 6.8038 | - | - |
|
596 |
-
| -1 | -1 | - | - | 0.6348 |
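The Epoch and Step columns in the rows above can be cross-checked against the stated dataset size. The sketch below is a rough estimate only: it divides each logged step by its fractional epoch to approximate the number of steps per epoch, then derives an implied batch size. The batch size is not shown in this diff, so the result is an inference, and the `no_duplicates` batch sampler may make the true number of batches differ slightly.

```python
# Training-set size as stated in the model card.
DATASET_SIZE = 2_839_738

# (epoch, step) pairs copied from the complete log rows above.
logged = [
    (0.9466, 42000),
    (0.9578, 42500),
    (0.9691, 43000),
    (0.9804, 43500),
    (0.9916, 44000),
]

# Each row gives one estimate of how many optimizer steps make up an epoch.
steps_per_epoch = [step / epoch for epoch, step in logged]
avg_steps_per_epoch = sum(steps_per_epoch) / len(steps_per_epoch)

# Implied samples per step (an inference, not a value reported in the card).
implied_batch_size = DATASET_SIZE / avg_steps_per_epoch

print(f"steps per epoch ~ {avg_steps_per_epoch:.0f}")
print(f"implied batch size ~ {implied_batch_size:.1f}")
```

Because the logged epochs are rounded to four decimals, the individual estimates differ by a few steps; averaging them is only a convenience.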
|
597 |
|
598 |
|
599 |
### Framework Versions
|
@@ -622,17 +605,6 @@ You can finetune this model on your own dataset.
|
|
622 |
}
|
623 |
```
|
624 |
|
625 |
-
#### CoSENTLoss
|
626 |
-
```bibtex
|
627 |
-
@online{kexuefm-8847,
|
628 |
-
title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
|
629 |
-
author={Su Jianlin},
|
630 |
-
year={2022},
|
631 |
-
month={Jan},
|
632 |
-
url={https://kexue.fm/archives/8847},
|
633 |
-
}
|
634 |
-
```
|
635 |
-
|
636 |
<!--
|
637 |
## Glossary
|
638 |
|
|
|
1 |
---
|
2 |
+
language:
|
3 |
+
- en
|
4 |
tags:
|
5 |
- sentence-transformers
|
6 |
- sentence-similarity
|
7 |
- feature-extraction
|
8 |
- generated_from_trainer
|
9 |
- dataset_size:2839738
|
10 |
+
- loss:CosineSimilarityLoss
|
11 |
base_model: Mihaiii/gte-micro-v4
|
12 |
widget:
|
13 |
- source_sentence: 314d5e89-55f7-42b4-af19-d4d0f499a265_c808a8ec-895c-4777-9e11-e83ce34eddef
|
|
|
140 |
|
141 |
Cycling {2} ({2}, Discard this card: Draw a card.)'
|
142 |
- https://cards.scryfall.io/normal/front/0/3/0367fac8-6990-4544-ac7d-ed363b55a9cf.jpg?1562700664
|
143 |
+
datasets:
|
144 |
+
- philipp-zettl/mtg_cards-2025-04-04
|
145 |
pipeline_tag: sentence-similarity
|
146 |
library_name: sentence-transformers
|
147 |
metrics:
|
|
|
158 |
type: sts-dev
|
159 |
metrics:
|
160 |
- type: pearson_cosine
|
161 |
+
value: 0.5887650824464006
|
162 |
name: Pearson Cosine
|
163 |
- type: spearman_cosine
|
164 |
+
value: 0.6572224332671058
|
165 |
name: Spearman Cosine
|
166 |
- task:
|
167 |
type: semantic-similarity
|
|
|
171 |
type: sts-test
|
172 |
metrics:
|
173 |
- type: pearson_cosine
|
174 |
+
value: 0.5859550968667168
|
175 |
name: Pearson Cosine
|
176 |
- type: spearman_cosine
|
177 |
+
value: 0.6548721245145304
|
178 |
name: Spearman Cosine
|
179 |
---
|
180 |
|
181 |
# SentenceTransformer based on Mihaiii/gte-micro-v4
|
182 |
|
183 |
+
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Mihaiii/gte-micro-v4](https://huggingface.co/Mihaiii/gte-micro-v4) on the [mtg_cards-2025-04-04](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
184 |
|
185 |
## Model Details
|
186 |
|
|
|
191 |
- **Output Dimensionality:** 384 dimensions
|
192 |
- **Similarity Function:** Cosine Similarity
|
193 |
- **Training Dataset:**
|
194 |
+
- [mtg_cards-2025-04-04](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04)
|
195 |
+
- **Language:** en
|
196 |
<!-- - **License:** Unknown -->
|
197 |
|
198 |
### Model Sources
|
|
|
277 |
|
278 |
| Metric | sts-dev | sts-test |
|
279 |
|:--------------------|:-----------|:-----------|
|
280 |
+
| pearson_cosine | 0.5888 | 0.586 |
|
281 |
+
| **spearman_cosine** | **0.6572** | **0.6549** |
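For readers unfamiliar with the two metrics in this table, here is a minimal sketch of how Pearson and Spearman correlations between predicted cosine similarities and gold scores can be computed. It assumes the metrics are plain correlations over (prediction, gold) pairs; the helper names and the toy values are hypothetical, and this is not the evaluator's own code.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data (hypothetical): model cosine similarities vs. gold scores.
predicted = [0.9, 0.1, 0.4, -0.3]
gold = [0.75, 0.25, 0.5, -1.0]

print("pearson :", round(pearson(predicted, gold), 4))
print("spearman:", round(spearman(predicted, gold), 4))
```

Spearman depends only on the ordering of the pairs, which is why it can be higher than Pearson when the predicted similarities are monotonic in the gold scores but not linear in them.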
|
282 |
|
283 |
<!--
|
284 |
## Bias, Risks and Limitations
|
|
|
296 |
|
297 |
### Training Dataset
|
298 |
|
299 |
+
#### mtg_cards-2025-04-04
|
300 |
|
301 |
+
* Dataset: [mtg_cards-2025-04-04](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04) at [a35ccc4](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04/tree/a35ccc4221eea5c0e29bba1c65d52b53c8f9d3ee)
|
302 |
* Size: 2,839,738 training samples
|
303 |
* Columns: <code>uuid</code>, <code>sentence_1</code>, <code>sentence_2</code>, <code>image_1</code>, <code>image_2</code>, and <code>score</code>
|
304 |
* Approximate statistics based on the first 1000 samples:
|
305 |
+
| | uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
306 |
+
|:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------|
|
307 |
+
| type | string | string | string | string | string | float |
|
308 |
+
| details | <ul><li>min: 49 tokens</li><li>mean: 56.99 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 69.4 tokens</li><li>max: 180 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 68.59 tokens</li><li>max: 166 tokens</li></ul> | <ul><li>min: 53 tokens</li><li>mean: 58.17 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.28 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: -1.0</li><li>mean: -0.43</li><li>max: 0.5</li></ul> |
|
309 |
* Samples:
|
310 |
| uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
311 |
|:---------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------|
|
312 |
| <code>08f9b863-10b7-46d6-badd-97381e6c7c5e_4330efa7-a11b-4776-9fb0-1cae8aed67b1</code> | <code>Title: Blast Zone<br>Type: Land<br>Desc: This land enters with a charge counter on it.<br>{T}: Add {C}.<br>{X}{X}, {T}: Put X charge counters on this land.<br>{3}, {T}, Sacrifice this land: Destroy each nonland permanent with mana value equal to the number of charge counters on this land.</code> | <code>Title: Tom van de Logt Bio (2000)<br>Type: Card<br>Desc: Quarterfinalist Tom van de Logt posted a perfect 6—0 record during the Standard portion of this year's World Championships. The 19-year-old Groesbeek, Holland native was playing a deck that had a big impact on the metagame this year, "Replenish." This deck used cards like Attunement and Frantic Search to put powerful enchantments, such as Parallax Wave and Opalescence, into the graveyard and then used Replenish to put them all back into play at once.</code> | <code>https://cards.scryfall.io/normal/front/0/8/08f9b863-10b7-46d6-badd-97381e6c7c5e.jpg?1674423042</code> | <code>https://cards.scryfall.io/normal/front/4/3/4330efa7-a11b-4776-9fb0-1cae8aed67b1.jpg?1562767017</code> | <code>0.25</code> |
|
313 |
+
| <code>abe9cf1e-d398-41e0-8b11-afe1015e4fd9_40cb67f7-b4e1-423b-8f55-d44ed383e778</code> | <code>Title: Coral Net<br>Cost: {U}<br>Colors: ['U']<br>Type: Enchantment — Aura<br>Desc: Enchant green or white creature<br>Enchanted creature has "At the beginning of your upkeep, sacrifice this creature unless you discard a card."</code> | <code>Title: Silumgar Butcher<br>Cost: {4}{B}<br>Colors: ['B']<br>Type: Creature — Zombie Djinn<br>Desc: Exploit (When this creature enters, you may sacrifice a creature.)<br>When this creature exploits a creature, target creature gets -3/-3 until end of turn.</code> | <code>https://cards.scryfall.io/normal/front/a/b/abe9cf1e-d398-41e0-8b11-afe1015e4fd9.jpg?1562631469</code> | <code>https://cards.scryfall.io/normal/front/4/0/40cb67f7-b4e1-423b-8f55-d44ed383e778.jpg?1562785294</code> | <code>-1.0</code> |
|
314 |
+
| <code>3dd13408-b4db-42e7-bf3c-d46716538a7c_05a6dc90-3997-4911-8bd6-854c85eca35b</code> | <code>Title: Rishadan Brigand<br>Cost: {4}{U}<br>Colors: ['U']<br>Type: Creature — Human Pirate<br>Desc: Flying<br>When this creature enters, each opponent sacrifices a permanent of their choice unless they pay {3}.<br>This creature can block only creatures with flying.</code> | <code>Title: Banishing Stroke<br>Cost: {5}{W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Put target artifact, creature, or enchantment on the bottom of its owner's library.<br>Miracle {W} (You may cast this card for its miracle cost when you draw it if it's the first card you drew this turn.)</code> | <code>https://cards.scryfall.io/normal/front/3/d/3dd13408-b4db-42e7-bf3c-d46716538a7c.jpg?1632145390</code> | <code>https://cards.scryfall.io/normal/front/0/5/05a6dc90-3997-4911-8bd6-854c85eca35b.jpg?1723433851</code> | <code>-1.0</code> |
|
315 |
+
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
|
316 |
```json
|
317 |
{
|
318 |
+
"loss_fct": "torch.nn.modules.loss.MSELoss"
|
|
|
319 |
}
|
320 |
```
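As a rough illustration of what the loss configuration above describes: `CosineSimilarityLoss` with `MSELoss` compares the cosine similarity of each embedding pair to the gold `score` using mean squared error. The sketch below uses plain Python and made-up 3-dimensional vectors (the model itself emits 384 dimensions); the function names are hypothetical and this is not the library's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_mse_loss(pairs, scores):
    """Mean squared error between pairwise cosine similarities and gold scores."""
    errors = [
        (cosine_similarity(u, v) - score) ** 2
        for (u, v), score in zip(pairs, scores)
    ]
    return sum(errors) / len(errors)


# Toy embedding pairs (hypothetical values, not model output).
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),   # same direction, cosine = 1
    ([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]),  # opposite direction, cosine = -1
]
# Gold scores in the style of the `score` column shown in the samples.
scores = [0.25, -1.0]

loss = cosine_mse_loss(pairs, scores)
print(loss)
```

The first pair is penalized because its embeddings are identical while the gold score is only 0.25; the second pair already matches its gold score of -1.0 and contributes nothing.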
|
321 |
|
322 |
### Evaluation Dataset
|
323 |
|
324 |
+
#### mtg_cards-2025-04-04
|
325 |
|
326 |
+
* Dataset: [mtg_cards-2025-04-04](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04) at [a35ccc4](https://huggingface.co/datasets/philipp-zettl/mtg_cards-2025-04-04/tree/a35ccc4221eea5c0e29bba1c65d52b53c8f9d3ee)
|
327 |
* Size: 74,730 evaluation samples
|
328 |
* Columns: <code>uuid</code>, <code>sentence_1</code>, <code>sentence_2</code>, <code>image_1</code>, <code>image_2</code>, and <code>score</code>
|
329 |
* Approximate statistics based on the first 1000 samples:
|
330 |
+
| | uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
331 |
+
|:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------|
|
332 |
+
| type | string | string | string | string | string | float |
|
333 |
+
| details | <ul><li>min: 50 tokens</li><li>mean: 56.9 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 68.44 tokens</li><li>max: 181 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 69.49 tokens</li><li>max: 179 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.22 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 52 tokens</li><li>mean: 58.21 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: -1.0</li><li>mean: -0.44</li><li>max: 0.75</li></ul> |
|
334 |
* Samples:
|
335 |
| uuid | sentence_1 | sentence_2 | image_1 | image_2 | score |
|
336 |
|:---------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------|
|
337 |
| <code>6bdd8645-aee9-44cb-acaa-2674f55cdf2f_b34bb149-2e50-462e-8b83-5c8339bb3aff</code> | <code>Title: Syr Cadian, Knight Owl<br>Cost: {3}{W}{W}<br>Colors: ['W']<br>Type: Legendary Creature — Bird Knight<br>Desc: Knightlifelink (Damage dealt by Knights you control also causes you to gain that much life.)<br>{W}: Syr Cadian gains vigilance until end of turn. Activate only from sunrise to sunset.<br>{B}: Syr Cadian gains flying until end of turn. Activate only from sunset to sunrise.</code> | <code>Title: Non-Human Cannonball<br>Cost: {2}{R}<br>Colors: ['R']<br>Type: Artifact Creature — Clown Robot<br>Desc: When this creature dies, roll a six-sided die. If the result is 4 or less, this creature deals that much damage to you.</code> | <code>https://cards.scryfall.io/normal/front/6/b/6bdd8645-aee9-44cb-acaa-2674f55cdf2f.jpg?1664317187</code> | <code>https://cards.scryfall.io/normal/front/b/3/b34bb149-2e50-462e-8b83-5c8339bb3aff.jpg?1673917877</code> | <code>0.25</code> |
|
338 |
| <code>860f4304-38f1-4c2f-a122-2590619522fd_08d6db9b-b2da-4148-aa49-8c2fecac6e32</code> | <code>Title: Hindering Light<br>Cost: {W}{U}<br>Colors: ['U', 'W']<br>Type: Instant<br>Desc: Counter target spell that targets you or a permanent you control.<br>Draw a card.</code> | <code>Title: Gleam of Resistance<br>Cost: {4}{W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Creatures you control get +1/+2 until end of turn. Untap those creatures.<br>Basic landcycling {1}{W} ({1}{W}, Discard this card: Search your library for a basic land card, reveal it, put it into your hand, then shuffle.)</code> | <code>https://cards.scryfall.io/normal/front/8/6/860f4304-38f1-4c2f-a122-2590619522fd.jpg?1712353583</code> | <code>https://cards.scryfall.io/normal/front/0/8/08d6db9b-b2da-4148-aa49-8c2fecac6e32.jpg?1573505575</code> | <code>0.25</code> |
|
339 |
| <code>91b448f4-aa0c-42c7-a771-e8dd20e0520c_46f810c2-310e-42f5-ab1f-d56396cf5124</code> | <code>Title: Practiced Tactics<br>Cost: {W}<br>Colors: ['W']<br>Type: Instant<br>Desc: Choose target attacking or blocking creature. Practiced Tactics deals damage to that creature equal to twice the number of creatures in your party. (Your party consists of up to one each of Cleric, Rogue, Warrior, and Wizard.)</code> | <code>Title: Anointer Priest<br>Cost: {1}{W}<br>Colors: ['W']<br>Type: Creature — Human Cleric<br>Desc: Whenever a creature token you control enters, you gain 1 life.<br>Embalm {3}{W} ({3}{W}, Exile this card from your graveyard: Create a token that's a copy of it, except it's a white Zombie Human Cleric with no mana cost. Embalm only as a sorcery.)</code> | <code>https://cards.scryfall.io/normal/front/9/1/91b448f4-aa0c-42c7-a771-e8dd20e0520c.jpg?1604192922</code> | <code>https://cards.scryfall.io/normal/front/4/6/46f810c2-310e-42f5-ab1f-d56396cf5124.jpg?1599769231</code> | <code>0.25</code> |
|
340 |
+
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
|
341 |
```json
|
342 |
{
|
343 |
+
"loss_fct": "torch.nn.modules.loss.MSELoss"
|
|
|
344 |
}
|
345 |
```

- `log_level_replica`: passive
- `log_on_each_node`: False
- `logging_nan_inf_filter`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: ./models/gte-micro-v4-mtg/
- `hub_model_id`: philipp-zettl/gte-micro-v4-mtg
- `hub_always_push`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False

- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: ./models/gte-micro-v4-mtg/
- `hub_model_id`: philipp-zettl/gte-micro-v4-mtg
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: True
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False

</details>
### Training Logs
| Epoch  | Step  | Training Loss | Validation Loss | sts-dev_spearman_cosine | sts-test_spearman_cosine |
|:------:|:-----:|:-------------:|:---------------:|:-----------------------:|:------------------------:|
| -1     | -1    | -             | -               | 0.3315                  | -                        |
| 0.0113 | 500   | 1.4254        | -               | -                       | -                        |
| 0.0225 | 1000  | 0.3809        | -               | -                       | -                        |
| 0.0338 | 1500  | 0.3494        | -               | -                       | -                        |
| 0.0451 | 2000  | 0.3481        | -               | -                       | -                        |
| 0.0563 | 2500  | 0.3466        | -               | -                       | -                        |
| 0.0676 | 3000  | 0.3475        | -               | -                       | -                        |
| 0.0789 | 3500  | 0.3467        | -               | -                       | -                        |
| 0.0901 | 4000  | 0.3467        | -               | -                       | -                        |
| 0.1014 | 4500  | 0.348         | -               | -                       | -                        |
| 0.1127 | 5000  | 0.3469        | 0.3448          | 0.6769                  | -                        |
| 0.1240 | 5500  | 0.3493        | -               | -                       | -                        |
| 0.1352 | 6000  | 0.3463        | -               | -                       | -                        |
| 0.1465 | 6500  | 0.3457        | -               | -                       | -                        |
| 0.1578 | 7000  | 0.3449        | -               | -                       | -                        |
| 0.1690 | 7500  | 0.3432        | -               | -                       | -                        |
| 0.1803 | 8000  | 0.3424        | -               | -                       | -                        |
| 0.1916 | 8500  | 0.3443        | -               | -                       | -                        |
| 0.2028 | 9000  | 0.344         | -               | -                       | -                        |
| 0.2141 | 9500  | 0.3466        | -               | -                       | -                        |
| 0.2254 | 10000 | 0.3421        | 0.3449          | 0.6726                  | -                        |
| 0.2366 | 10500 | 0.3422        | -               | -                       | -                        |
| 0.2479 | 11000 | 0.3439        | -               | -                       | -                        |
| 0.2592 | 11500 | 0.3454        | -               | -                       | -                        |
| 0.2704 | 12000 | 0.3476        | -               | -                       | -                        |
| 0.2817 | 12500 | 0.3461        | -               | -                       | -                        |
| 0.2930 | 13000 | 0.3483        | -               | -                       | -                        |
| 0.3043 | 13500 | 0.344         | -               | -                       | -                        |
| 0.3155 | 14000 | 0.3496        | -               | -                       | -                        |
| 0.3268 | 14500 | 0.3448        | -               | -                       | -                        |
| 0.3381 | 15000 | 0.3462        | 0.3442          | 0.6632                  | -                        |
| 0.3493 | 15500 | 0.3446        | -               | -                       | -                        |
| 0.3606 | 16000 | 0.3443        | -               | -                       | -                        |
| 0.3719 | 16500 | 0.3444        | -               | -                       | -                        |
| 0.3831 | 17000 | 0.3452        | -               | -                       | -                        |
| 0.3944 | 17500 | 0.3467        | -               | -                       | -                        |
| 0.4057 | 18000 | 0.3439        | -               | -                       | -                        |
| 0.4169 | 18500 | 0.3437        | -               | -                       | -                        |
| 0.4282 | 19000 | 0.3426        | -               | -                       | -                        |
| 0.4395 | 19500 | 0.3435        | -               | -                       | -                        |
| 0.4507 | 20000 | 0.3453        | 0.3443          | 0.6550                  | -                        |
| 0.4620 | 20500 | 0.3439        | -               | -                       | -                        |
| 0.4733 | 21000 | 0.3434        | -               | -                       | -                        |
| 0.4846 | 21500 | 0.3477        | -               | -                       | -                        |
| 0.4958 | 22000 | 0.3471        | -               | -                       | -                        |
| 0.5071 | 22500 | 0.3468        | -               | -                       | -                        |
| 0.5184 | 23000 | 0.3453        | -               | -                       | -                        |
| 0.5296 | 23500 | 0.3447        | -               | -                       | -                        |
| 0.5409 | 24000 | 0.3441        | -               | -                       | -                        |
| 0.5522 | 24500 | 0.3459        | -               | -                       | -                        |
| 0.5634 | 25000 | 0.3431        | 0.3447          | 0.6558                  | -                        |
| 0.5747 | 25500 | 0.3435        | -               | -                       | -                        |
| 0.5860 | 26000 | 0.3464        | -               | -                       | -                        |
| 0.5972 | 26500 | 0.3436        | -               | -                       | -                        |
| 0.6085 | 27000 | 0.3446        | -               | -                       | -                        |
| 0.6198 | 27500 | 0.3401        | -               | -                       | -                        |
| 0.6310 | 28000 | 0.347         | -               | -                       | -                        |
| 0.6423 | 28500 | 0.3412        | -               | -                       | -                        |
| 0.6536 | 29000 | 0.3427        | -               | -                       | -                        |
| 0.6648 | 29500 | 0.3423        | -               | -                       | -                        |
| 0.6761 | 30000 | 0.3407        | 0.3418          | 0.6612                  | -                        |
| 0.6874 | 30500 | 0.3404        | -               | -                       | -                        |
| 0.6987 | 31000 | 0.3413        | -               | -                       | -                        |
| 0.7099 | 31500 | 0.3434        | -               | -                       | -                        |
| 0.7212 | 32000 | 0.3437        | -               | -                       | -                        |
| 0.7325 | 32500 | 0.3442        | -               | -                       | -                        |
| 0.7437 | 33000 | 0.3413        | -               | -                       | -                        |
| 0.7550 | 33500 | 0.3441        | -               | -                       | -                        |
| 0.7663 | 34000 | 0.3387        | -               | -                       | -                        |
| 0.7775 | 34500 | 0.3416        | -               | -                       | -                        |
| 0.7888 | 35000 | 0.3409        | 0.3392          | 0.6554                  | -                        |
| 0.8001 | 35500 | 0.3414        | -               | -                       | -                        |
| 0.8113 | 36000 | 0.338         | -               | -                       | -                        |
| 0.8226 | 36500 | 0.3385        | -               | -                       | -                        |
| 0.8339 | 37000 | 0.3391        | -               | -                       | -                        |
| 0.8451 | 37500 | 0.3381        | -               | -                       | -                        |
| 0.8564 | 38000 | 0.3372        | -               | -                       | -                        |
| 0.8677 | 38500 | 0.3391        | -               | -                       | -                        |
| 0.8790 | 39000 | 0.3404        | -               | -                       | -                        |
| 0.8902 | 39500 | 0.3399        | -               | -                       | -                        |
| 0.9015 | 40000 | 0.3413        | 0.3376          | 0.6572                  | -                        |
| 0.9128 | 40500 | 0.3408        | -               | -                       | -                        |
| 0.9240 | 41000 | 0.342         | -               | -                       | -                        |
| 0.9353 | 41500 | 0.3389        | -               | -                       | -                        |
| 0.9466 | 42000 | 0.3375        | -               | -                       | -                        |
| 0.9578 | 42500 | 0.3378        | -               | -                       | -                        |
| 0.9691 | 43000 | 0.3386        | -               | -                       | -                        |
| 0.9804 | 43500 | 0.3377        | -               | -                       | -                        |
| 0.9916 | 44000 | 0.3362        | -               | -                       | -                        |
| -1     | -1    | -             | -               | -                       | 0.6549                   |
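
The Epoch and Step columns in the training log can be related with a little arithmetic. The sketch below extrapolates the number of steps in one epoch from the last logged training row and, combined with the `dataset_size:2839738` tag from the card metadata, estimates the samples consumed per step. It assumes that tag counts training samples and that training ran on one device without gradient accumulation; neither is stated in this section, so treat the result as an estimate, not a reported setting.

```python
# Values copied from the card: the dataset_size tag and the last logged training row.
dataset_size = 2_839_738
last_step = 44_000
last_epoch_fraction = 0.9916

# Steps in one full epoch, extrapolated from the last logged row.
steps_per_epoch = last_step / last_epoch_fraction

# Samples consumed per optimizer step.
# ASSUMPTION: single device, no gradient accumulation.
implied_batch_size = dataset_size / steps_per_epoch

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"implied batch size ~ {implied_batch_size:.1f}")
```

Under those assumptions the implied batch size comes out at roughly 64, which also means the log covers just under one full pass over the training data.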
### Framework Versions
}
```
<!--
## Glossary