--- tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:92337 - loss:MultipleNegativesRankingLoss base_model: Alibaba-NLP/gte-modernbert-base widget: - source_sentence: 'Philosophy Professor Running for US Senate! Richard Dien Winfield is a philosophy professor at UGA and is a distinguished Hegel scholar. I''m sure many of you are familiar with his many published books on Hegel and his audio lectures, available here: is currently running for US Senate for the state of Georgia and is just beginning his campaign. Check out his campaign launch video: his website: can follow him on Twitter: is advocating for a new social bill of rights. Here is where he is on just some of the issues: Federal Job Guarantee Fair Minimum Wage ($20/hr) Green New Deal Super Medicare for All Legal Care for All It''s great to see a philosophy professor running for office!' sentences: - Context and Context-Dependence - 'There are some contests about this, but: “Atheism” is typically defined in terms of “theism”. Theism, in turn, is best understood as a proposition—something that is either true or false. It is often defined as “the belief that God exists”, but here “belief” means “something believed”. It refers to the propositional content of belief, not to the attitude or psychological state of believing. This is why it makes sense to say that theism is true or false and to argue for or against theism. If, however, “atheism” is defined in terms of theism and theism is the proposition that God exists and not the psychological condition of believing that there is a God, then it follows that atheism is not the absence of the psychological condition of believing that God exists (more on this below). The “a-” in “atheism” must be understood as negation instead of absence, as “not” instead of “without”. 
Therefore, in philosophy at least, atheism should be construed as the proposition that God does not exist (or, more broadly, the proposition that there are no gods). rf: sometimes you hear the other thing - that atheism is the position wherein some believer "lacks the belief" that there are gods. This position is popular outside of philosophy. There are lots of good reasons to reject this alternative situation, as it tends to confuse other positions. You can read why here: https://old.reddit.com/r/askphilosophy/comments/2za4ez/vacuous_truths_and_shoe_atheism/cuyn8nm/' - 'If we are being honest, he has a rather slim chance to win the primary (the establishmenz seems to have anointed their candidate, and it''s not a bad one), and an uphill general election (last Senate election was R+12). And he doesn''t seem to say the right things in public.... Yeah. Edit: In my view, philosophers (and anyone) should very much think about whether running for office, in this situation, is the right call. Given the issues Winfield wants to push, it seems like funding an NGO or a think tank would be much better suited to forward the cause. Especially since the job guarantee is already authorized (but not mandated) by law, which is a fun quirk. But then again, just looking at the website, you can tell that this campaign isn''t run by professionals? It just sounds like a waste of time and resources. If you ever Plan to run for office, hire me to give you advice (I''ll do it for free if you''re not a Hegelian)' - source_sentence: 'Human embryos are fully human in a biological sense, they''re just at a particular stage of human development (a developing human, as opposed to non-humans becoming human), but maybe that''s what you meant. You''re effectively asking whether there are any pro-life philosophers, and there are. Here are some: Francis J. Beckwith, Stephen D. Schwarz, Christopher Kaczor, Don Marquis, Patrick Lee, and Jack Mulder. 
Most tend to be Catholics, but none assume or employ religious doctrines in making the case against abortion. I myself happen to be non-religious but pro-life, for example. If you have access to a university library, you can find plenty more authors who''ve written articles effectively defending the view you mentioned. Most of the authors I listed above have written books and articles that are more accessible, though.' sentences: - 'Just to add a few more papers to this: Manninen, B.A. (2007) “Revisiting the Argument from Foetal Potential”. Philosophy, Ethics and Humanities in Medicine. Volume 2, Number 1. Available at: P. George, R.P. (2005) “The Wrong of Abortion”. Contemporary Debates in Applied Ethics, pp.13-26. Cohen, A.I.; Wellman, C.H. (eds.) Malden, MA: Blackwell Pub.' - I myself happen to be non-religious but pro-life I am too lol, we're quite a rare breed. Are any of those authors openly non religious? I'll check some of them out anyway. - That last bit sounds right up the alley of what I'm looking at. Do you happen to know of any papers that elaborate any of these contentions? - source_sentence: Is there something morally wrong with cultural appropriation in the arts? I argue that the little philosophical work on this topic has been overly dismissive of moral objections to cultural appropriation. Nevertheless, I argue that philosophers working on epistemic injustice have developed powerful conceptual tools that can aid in our understanding of objections that have been levied by other scholars and artists. I then consider the relationship between these objections and the harms of cultural essentialism. I argue that focusing on (...) the systematic nature of appropriative harms may allow us to sidestep the problem of essentialism, but not without cost. (shrink) sentences: - Then, I(g, ( A Af -, y) -, (c -,+ (f -, y))) = o ensures that there is a w where I(w, c A AP -+ y) = 1 and I(w, a --+ (f -+ y)) = 0. 
This gives us x, y where Rwxy, I(x, a) = 1 and I(y, P -+ y) = 0, which in turn means that there are z, t where Ryzt, I(z, P) = 1 and I(t, y) = 0. - Applied Ethics, Misc - Defenses of Toleration - source_sentence: Whereas it may be said that in Britain freedom is regarded, in a sense, as a privilege bestowed from above, from the upper orders of society, in the United States it is regarded as a 'natural right'; an argument not really vitiated by pointing out that the natural rights doctrine is nonsense, for Americans throughout a great part of their history have believed that they do enjoy these rights. sentences: - '. Art is something unique, the spirit is unique in its source. Art is symbolic, since it always bears within itself a symbol, i.e., that which is eternal, and rejects that which is transitory. Art is free, since it arises from inspiration. "16 Bryusov and Berdyaev protested that they could only recognize socialist ideas to the extent that socialism respects the basic principle of their world view: the unconditional independence of the artist. The opinions of the Russian symbolists were a reflection of French Symbolism and Parnassism. At its base is the Romantic theory of art, derived from Novalis and F. Schlegel, according to which the artist is a Brahmin. (At the same time, a trend in Romanticism stressed the social obligations of the artist, e.g., Shelley in Defence of Poetry.) This apolitical tendency in Romanticism was inherited by the representatives of the idea of "art for art''s sake." Baudelaire wrote, in the Hymne: "Que tu viennes du Ciel ou de l''Enfer, qu''importe, o Beauté." Flaubert thought: "Aimonsnous en l''art comme les mystiques s''aiment en Dieu" (CorresponRevolution on the Development of Russian Esthetic Thinking," in Uchonie zapiski L.G.U., S istoricheskikh nauk, 1956, No. 220. Vol. II, 1893, p. 286) . The Goncourts wrote in their Journal in 1886 that only "pure literature" is a matter of life and death. 
And we are continually coming across statements in their writings concerning the eternality of the "truly beautiful," the independence of the artist, his superiority and disdain towards his clients and customers. Baudelaire wrote about Poe, and Gautier about Baudelaire, that the glory of the poet is his holding himself aloof from Utopians, philanthropists, socialists, etc.' - Maybe the guy in the room can not be said to understand Chinese but the room, itself, could. That's not to say that the room is conscious, though. - His next major contribution to the literature of educational reform appeared in 1845, in a series of articles in The Scotsman, entitled 'National Education and the Common Schools of Massachusetts'. In I847 these came out in pamphlet form as Remarks on National Education4 This was largely a re-statement of the argument advanced in his Edinburgh Review article, prefaced by an assurance to his readers that they had nothing to fear from the Government's being entrusted to run the nation's schools. 'In every free country the state is merely the representative of the general power (physical, moral and intellectual) of the country. It is not a distant and independent being, that can exist and A Note On 'Secular5 Education In The Nineteenth Century In Spite The Will Of Its Members. This was addressed to those who saw in Governmental interference in education the spectre of despotism rearing its head, a not unfounded fear since it was upon the Prussian example that Mann had erected his Massachusetts plan. Society's claim on the individual was not, however, a total one, for 'the individual has a right to unbounded liberty of self-determination as to what he shall learn and what he shall not learn. '2 Combe made no attempt to reconcile these two propositions. 
- source_sentence: I found Kant and the Problem of Metaphysics to be a good intro to his style (assuming you know Kant well)
  sentences:
  - I'm going on a hunch here and guessing that your professor was referring to Rorty's pragmatism, in which case this would seek to eliminate, or at least redesign, much of metaphysics and epistemology (primarily) but not philosophy in its entirety.
  - I would second this recommendation; Polt is a wonderful reader of Heidegger. I think he’s right to say that his Introduction is best read alongside B T.
  - Good suggestion, I'm currently working on his rather complicated Auseinandersetzung with Kant. It's baffling how insightful and at the same time how rash he can be! ;)
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on Alibaba-NLP/gte-modernbert-base
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: cosine_accuracy
      value: 0.9693415637860082
      name: Cosine Accuracy
---

# SentenceTransformer based on Alibaba-NLP/gte-modernbert-base

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
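
As a rough illustration of what "semantic search" over a dense vector space means, the sketch below ranks a small corpus by cosine similarity to a query. It is a minimal sketch, not this model's own code: the function names `cosine_similarity_matrix` and `rank_by_similarity` are hypothetical, and the random 768-dimensional vectors are stand-ins for real embeddings, so nothing needs to be downloaded.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T


def rank_by_similarity(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Return corpus row indices ordered from most to least similar to the query."""
    scores = cosine_similarity_matrix(query[None, :], corpus)[0]
    return np.argsort(-scores)


# Assumption: random vectors stand in for real 768-dimensional embeddings.
rng = np.random.default_rng(0)
corpus_embeddings = rng.normal(size=(4, 768))

# The query is a slightly perturbed copy of corpus item 2,
# so item 2 should come out as the nearest neighbour.
query_embedding = corpus_embeddings[2] + 0.01 * rng.normal(size=768)

ranking = rank_by_similarity(query_embedding, corpus_embeddings)
print(ranking[0])  # index of the closest corpus item
```

With real embeddings, the same ranking step would be applied to the arrays returned by the model's `encode` call instead of the random stand-ins.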
## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    'I found Kant and the Problem of Metaphysics to be a good intro to his style (assuming you know Kant well)',
    "Good suggestion, I'm currently working on his rather complicated Auseinandersetzung with Kant. It's baffling how insightful and at the same time how rash he can be! ;)",
    'I would second this recommendation; Polt is a wonderful reader of Heidegger. I think he’s right to say that his Introduction is best read alongside B T.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Triplet

* Evaluated with [TripletEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| **cosine_accuracy** | **0.9693** |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 92,337 training samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

|         | anchor | positive | negative |
|:--------|:-------|:---------|:---------|
| type    | string | string   | string   |
| details |        |          |          |

* Samples:

| anchor | positive | negative |
|:-------|:---------|:---------|
| This article presents Derrida as a philosopher of history by reinterpreting his De la Grammatologie. In particular, it provides a schematic reconstruction of Part II of that book from the perspective of the problem of history. My account extends work on historicity in Derrida by privileging the themes of ‘history’ and ‘diagram’ in the Rousseau part. I thereby establish a Derridean concept of history which aims at accounting for the continuities and discontinuities of the past. This is in contrast to (...) some criticism that Derrida leaves behind, or inadequately accounts for history. Derrida describes a necessarily contorted condition of relating any historical event or development to itself or to another. This historicity informs other well-known aspects of Derrida's work, like the ‘quasi-transcendental’ terms he developed. I conclude that ‘history’ is a critical element in any understanding of deconstruction, and that deconstruction entails new kinds of history, but that some axioms... | Derrida and Other Philosophers | Derrida: Animals |
| It is a struggle against the French policy of oppression, of economic impoverishment, and of social, cultural and political destruction of the Algerian people. Sartre, as a citizen of the oppressive state, chose to say Yes to the violent struggle of the oppressed, and No to the French violence. He explains that the aim of French violence is not only: the keeping of these enslaved men at arm's length; it seeks to dehumanize them. Everything will be done to wipe out their traditions, to substitute our language for theirs and to destroy their culture without giving them ours. Sheer physical fatigue will stupefy them. Starved and ill, if they have any spirit left, fear will finish the job; guns are levelled at the peasant; civilians come to take over his land and force him, by dint of flogging, to till the land for them. If he shows fight, the soldiers fire and he is no longer a man at all.2 Thus, the Algerians were forced to choose between two possibilities: slavery or freedom. They chose... | To avoid any political recognition of the freedom of the oppressed, and of their national independence, the oppressors will appeal to the status quo. Yet the colonialist prefers to evoke possibilities of social improvement because he knows that the demands of the natives are primarily political. And they are primarily political because the natives are aware that 'polities', in the colonies, is quite simply the installation and the regular functioning of an enormous repressive apparatus which alone permits super-exploitation. (CDR note 721) A Response To Hannah Arendt Arendt does not mention Sartre's political approach. It seems that she views the violent struggle of the national liberation movements as an antipolitical act, a destruction of the political realm, which she called the only true public realm. She repeatedly explains that the public realm is created by an exchange of opinions and dialogue between people, and by constituting laws that ensure stability and permanence. Violenc... | system and her [CIO] system diffel only in this respect. Thus, the "arbitrary 'bite' " [Mayo, p. 278], which Mayo finds objectionable in the [CIalt.] system is due to the satisfaction of a condition proposed by Neyman, a condition [CIO] stands in violation of.10 Lastly, on pp. 58-63 of my book, I offer a rebuttal to the objection discussed here, the objection that estimates labeled "best" by N-P standards may be deficient with respect to the legitimate concern to avoid conflicts between confidence levels and known (precise) probabilities. I base the rebuttal on a novel criterion: confidence equivalence. Perhaps others will find that defense adequate to excuse the triviality of (some) N-P "best" procedures. I do not. Nor do I find Mayo's proposals sufficient for the question at hand. 8This is Neyman's condition (ii) (Neyman 1937, p. 267). He uses it to eliminate a candidate estimation system, his #(1), pp. 269-70. 9I have recently discovered that R. von Mises observed this same difficul... |
| This paper explores the managerial aspects of the relationship with stakeholders, under the assumption that transfer of knowledge is being made from relationship marketing and market orientation perspectives. These marketing tools may prove useful to manage the relationship with other stakeholders, as has been the case with customers. This study focuses on a sample of Spanish companies representing 43% of listed companies with the largest market capitalization. Given that this is the first time that corporate relationship with stakeholders is analyzed (...) in Spain, a qualitative technique (case analysis) was used. The main conclusion of the study is that most of the participant companies have a reactive position vis-à-vis stakeholders management systems. This attitude is reflected in their concern exclusively about ethical indexes managers. (shrink) | Business Ethics | Specific Freedoms, Misc |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Evaluation Dataset

#### Unnamed Dataset

* Size: 4,860 evaluation samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

|         | anchor | positive | negative |
|:--------|:-------|:---------|:---------|
| type    | string | string   | string   |
| details |        |          |          |

* Samples:

| anchor | positive | negative |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | He next proceeds to definitions of logically necessary propositions and of general propositions. He then examines with some care the notion of an empirical law of nature, and concludes that such a law is neither an entailment proposition nor a collective general proposition. His own account is that an empirical law is a conjunction of a logical statement to the effect that the relation between two ostensive concepts, say 'P' and 'Q,' is one of inclusion-or-overlap (or, in the case of '-Q,' of exclusion-oroverlap), and of a factual statement that as a matter of fact everything is either not a P or else a Q. In spite of the ingenuity of this view two criticisms occur at once. How could the factual statement be known? 
Even if known, how would it justify the BOOK REVIEW 51 counterfactual statement, "If anything were a P, it would be a Q" In discussing the use of logical propositions in deductive reasoning and definitions, K6rner admits that we commonly make use of excessive entailments, e.... | It applies at most to only a subset of beautiful things. K6rner also states an embryonic theory of moral action. He holds that the proposition, "a person, X, states that an action b, which has naturalistic characteristics 'P,' is moral," is equivalent to the proposition " X applies 'P' to b and accepts 'P' practically." To accept a concept practically is to desire an action which satisfies the concept and to desire that everyone else desire like actions in like circumstances. His view here has the advantage of recognizing the implicit universalizability of moral statefnents, but otherwise it seems to be only a slightly modified emotive or attitudinal theory. Thus, for example, he says that a person "believes in" a purely teleological ethics if he practically accepts only teleological concepts and rules. In a final group of four chapters K6rner turns his attention to metaphysical "directives. " These are not propositions but BOOK REVIEW 53 rules. They include in each case a rule for the... | The soul is invisible, because it is that which does the seeing. Its externally given and appearing image is the physical, which may be studied in abstraction as if it were real in and for itself. Philosophy of science is the logical reflection on and of the meaning of existence qua scientifically knowing in correlation with the objects of the sciences. In opposition to irrationalism, for which logic falsifies immediate experience, scientific method criticizes immediate experience for being logically absurd : The sun rises in the East, wanders through the sky, and settles in the West ; this is true in immediate experience, but false in astronomy. 
The scientific worldview sees the world as object for a knowing subject ; perceptual data are thought in the logical form of concepts related in propositions. The principle of scientific reason assumes a partial identity between the logical forms in the mind and the same logical forms as determining objects of knowledge ; that which makes the ... | | Educating the gaze is easily understood as becoming conscious about what is 'really' happening in the world and becoming aware of the way our gaze is itself bound to a perspective and particular position. However, the paper explores a different idea. It understands educating the gaze not in the sense of 'educare' (teaching) but of 'e-ducere' as leading out, reaching out. E-ducating the gaze is not about getting at a liberated or critical view, but about liberating or displacing our view. (...) It is not about becoming conscious or aware , but about becoming attentive , about paying attention . E-ducating the gaze, then, is not depending on method, but relying on discipline; it does not require a rich methodology, but asks for a poor pedagogy, i.e. for practices which allow to expose ourselves. One example of such practice is that of walking. Consequently e-ducating the gaze could be about an invitation to go walking. This idea is explored b way of a comment on two quotations, one by Wa... | Applied Ethics | Modal and Intensional Logic, Misc | | There is still no consensus. Furthermore, it is also possible that Boltzmann held one view in his meth odology of science and another in what might be called ontology or his theory of nature (Blackmore 1972, 1982). But starting in 1990 a vast amount of previously unsuspected informa tion began to appear for the first time, which may have initially seemed to allow for the possibility that basic agreement on what Boltzmann's real' philosophy was might finally be attained. Ilse M. 
Fasol-Boltzmann, Boltzmann's granddaughter published a book in that year which included a great deal of philosophical material from him which she had translated from his original shorthand (Fasol-Boltzmann 1990). She also published eighteen of his lectures on natural philosophy plus Boltzmann's notes for them as well as several other philosophical fragments. We have included INTRODUCTION 3 a translation of three lectures plus an analysis of the remaining fifteen lectures in this anthology But the ideas revealed ... | Nevertheless, his apparent opposition to efficient causes makes one wonder just how realistic or practical his apparent or nascent world view was at that time (Fasol-Boltzmann 1990, p. 273). In conclusion, the new material and lectures clearly increase our under standing about Boltzmann the philosopher, methodologist, and mathem atician, even if a few contradictory remarks may also add to our confusion 8 JOHN BLACKMORE in some ways. But all in all the new data has added a great deal toward putting the philosophical perspective of this great man back together again. Like Humpty Dumpty who had a great fall, Boltzmann's philosophical heritage had been broken into many pieces, but all the king's scholars are rejoining enough together to gradually recreate an original and valuable outlook. Soon, many more people will appreciate that Ludwig Boltzmann was a profound if very troubled thinker and that there is a coherent system within the range of ideas which he was considering, even if he coul... | 16-24 , and a more popular exposition de l'esprit, . 18-27. 
|

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_eval_batch_size`: 16
- `learning_rate`: 3e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `ddp_find_unused_parameters`: False

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: False
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

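
To make the reported loss settings (`scale` 20.0, `similarity_fct` cos_sim) and the train batch size of 8 more concrete, here is a minimal pure-Python sketch of an in-batch negatives ranking loss. It is an illustration under stated assumptions, not the Sentence Transformers implementation: it assumes the batch consists of (anchor, positive) pairs, and the helper names `cos_sim` and `mnr_loss` as well as the toy 2-d embeddings are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over the batch.

    positives[i] is treated as the correct match for anchors[i];
    every other positive in the batch serves as a negative.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled similarity of this anchor to every candidate in the batch.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the matching candidate.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two anchors, two candidates.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
```

Under these assumptions, a batch of 8 pairs ranks each anchor against 8 candidates (its own positive plus 7 in-batch negatives). If the training data also supplies explicit negatives, they would enlarge the candidate set; the sketch does not cover that case.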
### Training Logs

| Epoch | Step | Training Loss | Validation Loss | cosine_accuracy |
|:------:|:-----:|:-------------:|:---------------:|:---------------:|
| 0.0022 | 25 | 1.2002 | - | - |
| 0.0043 | 50 | 1.1623 | - | - |
| 0.0065 | 75 | 1.2012 | - | - |
| 0.0087 | 100 | 1.1853 | - | - |
| 0.0108 | 125 | 0.9767 | - | - |
| 0.0130 | 150 | 0.865 | - | - |
| 0.0152 | 175 | 0.7733 | - | - |
| 0.0173 | 200 | 0.9545 | - | - |
| 0.0195 | 225 | 0.8309 | - | - |
| 0.0217 | 250 | 0.7514 | - | - |
| 0.0238 | 275 | 0.5555 | - | - |
| 0.0260 | 300 | 0.563 | - | - |
| 0.0282 | 325 | 0.618 | - | - |
| 0.0303 | 350 | 0.6538 | - | - |
| 0.0325 | 375 | 0.5802 | - | - |
| 0.0347 | 400 | 0.6568 | - | - |
| 0.0368 | 425 | 0.4934 | - | - |
| 0.0390 | 450 | 0.597 | - | - |
| 0.0412 | 475 | 0.3812 | - | - |
| 0.0433 | 500 | 0.482 | - | - |
| 0.0455 | 525 | 0.5347 | - | - |
| 0.0476 | 550 | 0.5012 | - | - |
| 0.0498 | 575 | 0.5765 | - | - |
| 0.0520 | 600 | 0.4286 | - | - |
| 0.0541 | 625 | 0.5167 | - | - |
| 0.0563 | 650 | 0.4791 | - | - |
| 0.0585 | 675 | 0.5022 | - | - |
| 0.0606 | 700 | 0.438 | - | - |
| 0.0628 | 725 | 0.3995 | - | - |
| 0.0650 | 750 | 0.2924 | - | - |
| 0.0671 | 775 | 0.4391 | - | - |
| 0.0693 | 800 | 0.4328 | - | - |
| 0.0715 | 825 | 0.5658 | - | - |
| 0.0736 | 850 | 0.4541 | - | - |
| 0.0758 | 875 | 0.5381 | - | - |
| 0.0780 | 900 | 0.4523 | - | - |
| 0.0801 | 925 | 0.3522 | - | - |
| 0.0823 | 950 | 0.4475 | - | - |
| 0.0845 | 975 | 0.4448 | - | - |
| 0.0866 | 1000 | 0.407 | - | - |
| 0.0888 | 1025 | 0.4616 | - | - |
| 0.0910 | 1050 | 0.4213 | - | - |
| 0.0931 | 1075 | 0.465 | - | - |
| 0.0953 | 1100 | 0.2964 | - | - |
| 0.0975 | 1125 | 0.4414 | - | - |
| 0.0996 | 1150 | 0.3508 | - | - |
| 0.1018 | 1175 | 0.3362 | - | - |
| 0.1040 | 1200 | 0.4953 | - | - |
| 0.1061 | 1225 | 0.4041 | - | - |
| 0.1083 | 1250 | 0.3773 | - | - |
| 0.1105 | 1275 | 0.3574 | - | - |
| 0.1126 | 1300 | 0.642 | - | - |
| 0.1148 | 1325 | 0.3783 | - | - |
| 0.1170 | 1350 | 0.4905 | - | - |
| 0.1191 | 1375 | 0.3937 | - | - |
| 0.1213 | 1400 | 0.4245 | - | - |
| 0.1235 | 1425 | 0.4139 | - | - |
| 0.1256 | 1450 | 0.4305 | - | - |
| 0.1278 | 1475 | 0.675 | - | - |
| 0.1299 | 1500 | 0.55 | - | - |
| 0.1321 | 1525 | 0.4033 | - | - |
| 0.1343 | 1550 | 0.4167 | - | - |
| 0.1364 | 1575 | 0.3814 | - | - |
| 0.1386 | 1600 | 0.5183 | - | - |
| 0.1408 | 1625 | 0.3343 | - | - |
| 0.1429 | 1650 | 0.4212 | - | - |
| 0.1451 | 1675 | 0.4737 | - | - |
| 0.1473 | 1700 | 0.4563 | - | - |
| 0.1494 | 1725 | 0.4251 | - | - |
| 0.1516 | 1750 | 0.3497 | - | - |
| 0.1538 | 1775 | 0.3753 | - | - |
| 0.1559 | 1800 | 0.4031 | - | - |
| 0.1581 | 1825 | 0.4037 | - | - |
| 0.1603 | 1850 | 0.4114 | - | - |
| 0.1624 | 1875 | 0.3848 | - | - |
| 0.1646 | 1900 | 0.5088 | - | - |
| 0.1668 | 1925 | 0.4032 | - | - |
| 0.1689 | 1950 | 0.3354 | - | - |
| 0.1711 | 1975 | 0.4163 | - | - |
| 0.1733 | 2000 | 0.3715 | - | - |
| 0.1754 | 2025 | 0.3424 | - | - |
| 0.1776 | 2050 | 0.3311 | - | - |
| 0.1798 | 2075 | 0.4362 | - | - |
| 0.1819 | 2100 | 0.4441 | - | - |
| 0.1841 | 2125 | 0.3122 | - | - |
| 0.1863 | 2150 | 0.3717 | - | - |
| 0.1884 | 2175 | 0.3461 | - | - |
| 0.1906 | 2200 | 0.4816 | - | - |
| 0.1928 | 2225 | 0.4784 | - | - |
| 0.1949 | 2250 | 0.4334 | - | - |
| 0.1971 | 2275 | 0.3437 | - | - |
| 0.1993 | 2300 | 0.4333 | - | - |
| 0.2014 | 2325 | 0.3609 | - | - |
| 0.2036 | 2350 | 0.3437 | - | - |
| 0.2058 | 2375 | 0.4911 | - | - |
| 0.2079 | 2400 | 0.3872 | - | - |
| 0.2101 | 2425 | 0.276 | - | - |
| 0.2122 | 2450 | 0.3318 | - | - |
| 0.2144 | 2475 | 0.4833 | - | - |
| 0.2166 | 2500 | 0.4656 | - | - |
| 0.2187 | 2525 | 0.4232 | - | - |
| 0.2209 | 2550 | 0.434 | - | - |
| 0.2231 | 2575 | 0.2479 | - | - |
| 0.2252 | 2600 | 0.4656 | - | - |
| 0.2274 | 2625 | 0.3881 | - | - |
| 0.2296 | 2650 | 0.3637 | - | - |
| 0.2317 | 2675 | 0.3099 | - | - |
| 0.2339 | 2700 | 0.3933 | - | - |
| 0.2361 | 2725 | 0.3789 | - | - |
| 0.2382 | 2750 | 0.4056 | - | - |
| 0.2404 | 2775 | 0.4132 | - | - |
| 0.2426 | 2800 | 0.375 | - | - |
| 0.2447 | 2825 | 0.3026 | - | - |
| 0.2469 | 2850 | 0.5372 | - | - |
| 0.2491 | 2875 | 0.4233 | - | - |
| 0.2512 | 2900 | 0.2945 | - | - |
| 0.2534 | 2925 | 0.2916 | - | - |
| 0.2556 | 2950 | 0.3536 | - | - |
| 0.2577 | 2975 | 0.3246 | - | - |
| 0.2599 | 3000 | 0.4236 | - | - |
| 0.2621 | 3025 | 0.4088 | - | - |
| 0.2642 | 3050 | 0.4522 | - | - |
| 0.2664 | 3075 | 0.3445 | - | - |
| 0.2686 | 3100 | 0.3575 | - | - |
| 0.2707 | 3125 | 0.3809 | - | - |
| 0.2729 | 3150 | 0.3364 | - | - |
| 0.2751 | 3175 | 0.4103 | - | - |
| 0.2772 | 3200 | 0.3502 | - | - |
| 0.2794 | 3225 | 0.2632 | - | - |
| 0.2816 | 3250 | 0.406 | - | - |
| 0.2837 | 3275 | 0.4363 | - | - |
| 0.2859 | 3300 | 0.2819 | - | - |
| 0.2881 | 3325 | 0.3421 | - | - |
| 0.2902 | 3350 | 0.269 | - | - |
| 0.2924 | 3375 | 0.2902 | - | - |
| 0.2946 | 3400 | 0.3548 | - | - |
| 0.2967 | 3425 | 0.4575 | - | - |
| 0.2989 | 3450 | 0.3942 | - | - |
| 0.3010 | 3475 | 0.3537 | - | - |
| 0.3032 | 3500 | 0.3672 | - | - |
| 0.3054 | 3525 | 0.3502 | - | - |
| 0.3075 | 3550 | 0.2545 | - | - |
| 0.3097 | 3575 | 0.2544 | - | - |
| 0.3119 | 3600 | 0.3443 | - | - |
| 0.3140 | 3625 | 0.3784 | - | - |
| 0.3162 | 3650 | 0.3828 | - | - |
| 0.3184 | 3675 | 0.4032 | - | - |
| 0.3205 | 3700 | 0.2556 | - | - |
| 0.3227 | 3725 | 0.3352 | - | - |
| 0.3249 | 3750 | 0.4054 | - | - |
| 0.3270 | 3775 | 0.3049 | - | - |
| 0.3292 | 3800 | 0.2223 | - | - |
| 0.3314 | 3825 | 0.4878 | - | - |
| 0.3335 | 3850 | 0.3015 | - | - |
| 0.3357 | 3875 | 0.3816 | - | - |
| 0.3379 | 3900 | 0.3334 | - | - |
| 0.3400 | 3925 | 0.3724 | - | - |
| 0.3422 | 3950 | 0.4217 | - | - |
| 0.3444 | 3975 | 0.4339 | - | - |
| 0.3465 | 4000 | 0.3642 | - | - |
| 0.3487 | 4025 | 0.3819 | - | - |
| 0.3509 | 4050 | 0.2796 | - | - |
| 0.3530 | 4075 | 0.4277 | - | - |
| 0.3552 | 4100 | 0.3407 | - | - |
| 0.3574 | 4125 | 0.2781 | - | - |
| 0.3595 | 4150 | 0.4274 | - | - |
| 0.3617 | 4175 | 0.3609 | - | - |
| 0.3639 | 4200 | 0.3476 | - | - |
| 0.3660 | 4225 | 0.41 | - | - |
| 0.3682 | 4250 | 0.4003 | - | - |
| 0.3704 | 4275 | 0.306 | - | - |
| 0.3725 | 4300 | 0.2335 | - | - |
| 0.3747 | 4325 | 0.2733 | - | - |
| 0.3769 | 4350 | 0.3007 | - | - |
| 0.3790 | 4375 | 0.3086 | - | - |
| 0.3812 | 4400 | 0.365 | - | - |
| 0.3833 | 4425 | 0.3255 | - | - |
| 0.3855 | 4450 | 0.3765 | - | - |
| 0.3877 | 4475 | 0.2946 | - | - |
| 0.3898 | 4500 | 0.3298 | - | - |
| 0.3920 | 4525 | 0.3645 | - | - |
| 0.3942 | 4550 | 0.2403 | - | - |
| 0.3963 | 4575 | 0.28 | - | - |
| 0.3985 | 4600 | 0.3814 | - | - |
| 0.4007 | 4625 | 0.3419 | - | - |
| 0.4028 | 4650 | 0.3374 | - | - |
| 0.4050 | 4675 | 0.3511 | - | - |
| 0.4072 | 4700 | 0.4339 | - | - |
| 0.4093 | 4725 | 0.3441 | - | - |
| 0.4115 | 4750 | 0.346 | - | - |
| 0.4137 | 4775 | 0.3723 | - | - |
| 0.4158 | 4800 | 0.2075 | - | - |
| 0.4180 | 4825 | 0.2431 | - | - |
| 0.4202 | 4850 | 0.2642 | - | - |
| 0.4223 | 4875 | 0.1763 | - | - |
| 0.4245 | 4900 | 0.3862 | - | - |
| 0.4267 | 4925 | 0.3053 | - | - |
| 0.4288 | 4950 | 0.3162 | - | - |
| 0.4310 | 4975 | 0.3178 | - | - |
| 0.4332 | 5000 | 0.2789 | - | - |
| 0.4353 | 5025 | 0.1777 | - | - |
| 0.4375 | 5050 | 0.4155 | - | - |
| 0.4397 | 5075 | 0.2983 | - | - |
| 0.4418 | 5100 | 0.3687 | - | - |
| 0.4440 | 5125 | 0.2428 | - | - |
| 0.4462 | 5150 | 0.3071 | - | - |
| 0.4483 | 5175 | 0.2911 | - | - |
| 0.4505 | 5200 | 0.3152 | - | - |
| 0.4527 | 5225 | 0.2776 | - | - |
| 0.4548 | 5250 | 0.2674 | - | - |
| 0.4570 | 5275 | 0.3035 | - | - |
| 0.4592 | 5300 | 0.3352 | - | - |
| 0.4613 | 5325 | 0.3879 | - | - |
| 0.4635 | 5350 | 0.3828 | - | - |
| 0.4657 | 5375 | 0.2797 | - | - |
| 0.4678 | 5400 | 0.3492 | - | - |
| 0.4700 | 5425 | 0.5 | - | - |
| 0.4721 | 5450 | 0.2317 | - | - |
| 0.4743 | 5475 | 0.2411 | - | - |
| 0.4765 | 5500 | 0.277 | - | - |
| 0.4786 | 5525 | 0.4112 | - | - |
| 0.4808 | 5550 | 0.5116 | - | - |
| 0.4830 | 5575 | 0.3264 | - | - |
| 0.4851 | 5600 | 0.3688 | - | - |
| 0.4873 | 5625 | 0.3224 | - | - |
| 0.4895 | 5650 | 0.3778 | - | - |
| 0.4916 | 5675 | 0.3671 | - | - |
| 0.4938 | 5700 | 0.3331 | - | - |
| 0.4960 | 5725 | 0.3426 | - | - |
| 0.4981 | 5750 | 0.2863 | - | - |
| 0.5003 | 5775 | 0.5822 | - | - |
| 0.5025 | 5800 | 0.2687 | - | - |
| 0.5046 | 5825 | 0.3365 | - | - |
| 0.5068 | 5850 | 0.4609 | - | - |
| 0.5090 | 5875 | 0.3127 | - | - |
| 0.5111 | 5900 | 0.2705 | - | - |
| 0.5133 | 5925 | 0.3089 | - | - |
| 0.5155 | 5950 | 0.3386 | - | - |
| 0.5176 | 5975 | 0.3796 | - | - |
| 0.5198 | 6000 | 0.4231 | - | - |
| 0.5220 | 6025 | 0.3922 | - | - |
| 0.5241 | 6050 | 0.3138 | - | - |
| 0.5263 | 6075 | 0.3106 | - | - |
| 0.5285 | 6100 | 0.188 | - | - |
| 0.5306 | 6125 | 0.209 | - | - |
| 0.5328 | 6150 | 0.2617 | - | - |
| 0.5350 | 6175 | 0.3059 | - | - |
| 0.5371 | 6200 | 0.2764 | - | - |
| 0.5393 | 6225 | 0.2801 | - | - |
| 0.5415 | 6250 | 0.3744 | - | - |
| 0.5436 | 6275 | 0.3067 | - | - |
| 0.5458 | 6300 | 0.3305 | - | - |
| 0.5480 | 6325 | 0.2827 | - | - |
| 0.5501 | 6350 | 0.2712 | - | - |
| 0.5523 | 6375 | 0.2677 | - | - |
| 0.5544 | 6400 | 0.4269 | - | - |
| 0.5566 | 6425 | 0.3834 | - | - |
| 0.5588 | 6450 | 0.4177 | - | - |
| 0.5609 | 6475 | 0.2457 | - | - |
| 0.5631 | 6500 | 0.348 | - | - |
| 0.5653 | 6525 | 0.3035 | - | - |
| 0.5674 | 6550 | 0.39 | - | - |
| 0.5696 | 6575 | 0.366 | - | - |
| 0.5718 | 6600 | 0.2299 | - | - |
| 0.5739 | 6625 | 0.1737 | - | - |
| 0.5761 | 6650 | 0.3773 | - | - |
| 0.5783 | 6675 | 0.3409 | - | - |
| 0.5804 | 6700 | 0.1739 | - | - |
| 0.5826 | 6725 | 0.3462 | - | - |
| 0.5848 | 6750 | 0.2976 | - | - |
| 0.5869 | 6775 | 0.3246 | - | - |
| 0.5891 | 6800 | 0.3808 | - | - |
| 0.5913 | 6825 | 0.2926 | - | - |
| 0.5934 | 6850 | 0.2709 | - | - |
| 0.5956 | 6875 | 0.3777 | - | - |
| 0.5978 | 6900 | 0.2834 | - | - |
| 0.5999 | 6925 | 0.2965 | - | - |
| 0.6021 | 6950 | 0.2399 | - | - |
| 0.6043 | 6975 | 0.2936 | - | - |
| 0.6064 | 7000 | 0.2674 | - | - |
| 0.6086 | 7025 | 0.265 | - | - |
| 0.6108 | 7050 | 0.3257 | - | - |
| 0.6129 | 7075 | 0.3504 | - | - |
| 0.6151 | 7100 | 0.1485 | - | - |
| 0.6173 | 7125 | 0.2598 | - | - |
| 0.6194 | 7150 | 0.2838 | - | - |
| 0.6216 | 7175 | 0.3391 | - | - |
| 0.6238 | 7200 | 0.3568 | - | - |
| 0.6259 | 7225 | 0.3001 | - | - |
| 0.6281 | 7250 | 0.2613 | - | - |
| 0.6303 | 7275 | 0.3379 | - | - |
| 0.6324 | 7300 | 0.3347 | - | - |
| 0.6346 | 7325 | 0.242 | - | - |
| 0.6367 | 7350 | 0.3076 | - | - |
| 0.6389 | 7375 | 0.3055 | - | - |
| 0.6411 | 7400 | 0.4014 | - | - |
| 0.6432 | 7425 | 0.3723 | - | - |
| 0.6454 | 7450 | 0.3421 | - | - |
| 0.6476 | 7475 | 0.4306 | - | - |
| 0.6497 | 7500 | 0.2536 | - | - |
| 0.6519 | 7525 | 0.264 | - | - |
| 0.6541 | 7550 | 0.1767 | - | - |
| 0.6562 | 7575 | 0.259 | - | - |
| 0.6584 | 7600 | 0.2761 | - | - |
| 0.6606 | 7625 | 0.2934 | - | - |
| 0.6627 | 7650 | 0.3055 | - | - |
| 0.6649 | 7675 | 0.2532 | - | - |
| 0.6671 | 7700 | 0.2942 | - | - |
| 0.6692 | 7725 | 0.2048 | - | - |
| 0.6714 | 7750 | 0.2884 | - | - |
| 0.6736 | 7775 | 0.3598 | - | - |
| 0.6757 | 7800 | 0.3318 | - | - |
| 0.6779 | 7825 | 0.3058 | - | - |
| 0.6801 | 7850 | 0.3395 | - | - |
| 0.6822 | 7875 | 0.2973 | - | - |
| 0.6844 | 7900 | 0.2741 | - | - |
| 0.6866 | 7925 | 0.2493 | - | - |
| 0.6887 | 7950 | 0.2966 | - | - |
| 0.6909 | 7975 | 0.3207 | - | - |
| 0.6931 | 8000 | 0.2501 | - | - |
| 0.6952 | 8025 | 0.4028 | - | - |
| 0.6974 | 8050 | 0.3549 | - | - |
| 0.6996 | 8075 | 0.3805 | - | - |
| 0.7017 | 8100 | 0.353 | - | - |
| 0.7039 | 8125 | 0.3569 | - | - |
| 0.7061 | 8150 | 0.2588 | - | - |
| 0.7082 | 8175 | 0.2252 | - | - |
| 0.7104 | 8200 | 0.2747 | - | - |
| 0.7126 | 8225 | 0.3239 | - | - |
| 0.7147 | 8250 | 0.2954 | - | - |
| 0.7169 | 8275 | 0.3749 | - | - |
| 0.7191 | 8300 | 0.2757 | - | - |
| 0.7212 | 8325 | 0.3012 | - | - |
| 0.7234 | 8350 | 0.2985 | - | - |
| 0.7255 | 8375 | 0.2656 | - | - |
| 0.7277 | 8400 | 0.2007 | - | - |
| 0.7299 | 8425 | 0.2402 | - | - |
| 0.7320 | 8450 | 0.3434 | - | - |
| 0.7342 | 8475 | 0.2628 | - | - |
| 0.7364 | 8500 | 0.265 | - | - |
| 0.7385 | 8525 | 0.3748 | - | - |
| 0.7407 | 8550 | 0.249 | - | - |
| 0.7429 | 8575 | 0.3375 | - | - |
| 0.7450 | 8600 | 0.3336 | - | - |
| 0.7472 | 8625 | 0.3702 | - | - |
| 0.7494 | 8650 | 0.3494 | - | - |
| 0.7515 | 8675 | 0.2996 | - | - |
| 0.7537 | 8700 | 0.2433 | - | - |
| 0.7559 | 8725 | 0.3027 | - | - |
| 0.7580 | 8750 | 0.382 | - | - |
| 0.7602 | 8775 | 0.2874 | - | - |
| 0.7624 | 8800 | 0.2737 | - | - |
| 0.7645 | 8825 | 0.3212 | - | - |
| 0.7667 | 8850 | 0.3475 | - | - |
| 0.7689 | 8875 | 0.221 | - | - |
| 0.7710 | 8900 | 0.2587 | - | - |
| 0.7732 | 8925 | 0.2852 | - | - |
| 0.7754 | 8950 | 0.3837 | - | - |
| 0.7775 | 8975 | 0.2333 | - | - |
| 0.7797 | 9000 | 0.3036 | - | - |
| 0.7819 | 9025 | 0.3287 | - | - |
| 0.7840 | 9050 | 0.3248 | - | - |
| 0.7862 | 9075 | 0.2395 | - | - |
| 0.7884 | 9100 | 0.2647 | - | - |
| 0.7905 | 9125 | 0.3345 | - | - |
| 0.7927 | 9150 | 0.3421 | - | - |
| 0.7949 | 9175 | 0.3496 | - | - |
| 0.7970 | 9200 | 0.253 | - | - |
| 0.7992 | 9225 | 0.3462 | - | - |
| 0.8014 | 9250 | 0.2688 | - | - |
| 0.8035 | 9275 | 0.3301 | - | - |
| 0.8057 | 9300 | 0.3382 | - | - |
| 0.8078 | 9325 | 0.2219 | - | - |
| 0.8100 | 9350 | 0.278 | - | - |
| 0.8122 | 9375 | 0.2338 | - | - |
| 0.8143 | 9400 | 0.2732 | - | - |
| 0.8165 | 9425 | 0.2973 | - | - |
| 0.8187 | 9450 | 0.2783 | - | - |
| 0.8208 | 9475 | 0.2418 | - | - |
| 0.8230 | 9500 | 0.2603 | - | - |
| 0.8252 | 9525 | 0.1888 | - | - |
| 0.8273 | 9550 | 0.2581 | - | - |
| 0.8295 | 9575 | 0.2742 | - | - |
| 0.8317 | 9600 | 0.2156 | - | - |
| 0.8338 | 9625 | 0.3317 | - | - |
| 0.8360 | 9650 | 0.1967 | - | - |
| 0.8382 | 9675 | 0.1701 | - | - |
| 0.8403 | 9700 | 0.3064 | - | - |
| 0.8425 | 9725 | 0.3511 | - | - |
| 0.8447 | 9750 | 0.2461 | - | - |
| 0.8468 | 9775 | 0.3047 | - | - |
| 0.8490 | 9800 | 0.3234 | - | - |
| 0.8512 | 9825 | 0.2843 | - | - |
| 0.8533 | 9850 | 0.3365 | - | - |
| 0.8555 | 9875 | 0.3802 | - | - |
| 0.8577 | 9900 | 0.2587 | - | - |
| 0.8598 | 9925 | 0.2367 | - | - |
| 0.8620 | 9950 | 0.2971 | - | - |
| 0.8642 | 9975 | 0.2884 | - | - |
| 0.8663 | 10000 | 0.2296 | - | - |
| 0.8685 | 10025 | 0.3145 | - | - |
| 0.8707 | 10050 | 0.178 | - | - |
| 0.8728 | 10075 | 0.2681 | - | - |
| 0.8750 | 10100 | 0.3191 | - | - |
| 0.8772 | 10125 | 0.2544 | - | - |
| 0.8793 | 10150 | 0.2965 | - | - |
| 0.8815 | 10175 | 0.317 | - | - |
| 0.8837 | 10200 | 0.2149 | - | - |
| 0.8858 | 10225 | 0.4876 | - | - |
| 0.8880 | 10250 | 0.2984 | - | - |
| 0.8901 | 10275 | 0.3024 | - | - |
| 0.8923 | 10300 | 0.2447 | - | - |
| 0.8945 | 10325 | 0.2684 | - | - |
| 0.8966 | 10350 | 0.1714 | - | - |
| 0.8988 | 10375 | 0.2776 | - | - |
| 0.9010 | 10400 | 0.2745 | - | - |
| 0.9031 | 10425 | 0.3299 | - | - |
| 0.9053 | 10450 | 0.2629 | - | - |
| 0.9075 | 10475 | 0.3627 | - | - |
| 0.9096 | 10500 | 0.2236 | - | - |
| 0.9118 | 10525 | 0.2819 | - | - |
| 0.9140 | 10550 | 0.3129 | - | - |
| 0.9161 | 10575 | 0.3051 | - | - |
| 0.9183 | 10600 | 0.3955 | - | - |
| 0.9205 | 10625 | 0.2493 | - | - |
| 0.9226 | 10650 | 0.2543 | - | - |
| 0.9248 | 10675 | 0.2222 | - | - |
| 0.9270 | 10700 | 0.2823 | - | - |
| 0.9291 | 10725 | 0.3098 | - | - |
| 0.9313 | 10750 | 0.3009 | - | - |
| 0.9335 | 10775 | 0.2623 | - | - |
| 0.9356 | 10800 | 0.1952 | - | - |
| 0.9378 | 10825 | 0.4527 | - | - |
| 0.9400 | 10850 | 0.2323 | - | - |
| 0.9421 | 10875 | 0.3109 | - | - |
| 0.9443 | 10900 | 0.3335 | - | - |
| 0.9465 | 10925 | 0.2862 | - | - |
| 0.9486 | 10950 | 0.4005 | - | - |
| 0.9508 | 10975 | 0.2815 | - | - |
| 0.9530 | 11000 | 0.2157 | - | - |
| 0.9551 | 11025 | 0.3733 | - | - |
| 0.9573 | 11050 | 0.2843 | - | - |
| 0.9595 | 11075 | 0.1963 | - | - |
| 0.9616 | 11100 | 0.3081 | - | - |
| 0.9638 | 11125 | 0.2317 | - | - |
| 0.9660 | 11150 | 0.3027 | - | - |
| 0.9681 | 11175 | 0.3581 | - | - |
| 0.9703 | 11200 | 0.3 | - | - |
| 0.9725 | 11225 | 0.2797 | - | - |
| 0.9746 | 11250 | 0.2918 | - | - |
| 0.9768 | 11275 | 0.2519 | - | - |
| 0.9789 | 11300 | 0.2183 | - | - |
| 0.9811 | 11325 | 0.2764 | - | - |
| 0.9833 | 11350 | 0.4107 | - | - |
| 0.9854 | 11375 | 0.3135 | - | - |
| 0.9876 | 11400 | 0.2138 | - | - |
| 0.9898 | 11425 | 0.2984 | - | - |
| 0.9919 | 11450 | 0.2407 | - | - |
| 0.9941 | 11475 | 0.2449 | - | - |
| 0.9963 | 11500 | 0.2629 | - | - |
| 0.9984 | 11525 | 0.3488 | - | - |
| 1.0 | 11543 | - | 0.4268 | 0.9693 |

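
The step and epoch columns in the log can be cross-checked against the reported settings with some simple arithmetic. The sketch below assumes a single training process and no gradient accumulation (`gradient_accumulation_steps` is 1), takes the dataset size from the card's `dataset_size:92337` tag, and assumes the warmup step count is the total step count times `warmup_ratio`, rounded up. The helper names `epoch_at` and `linear_lr` are hypothetical, and the schedule is a plain linear warmup followed by linear decay, written out by hand and not read from the trainer.

```python
import math

DATASET_SIZE = 92337      # from the `dataset_size:92337` tag
BATCH_SIZE = 8            # per_device_train_batch_size
LEARNING_RATE = 3e-05     # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio

# One epoch, so total optimizer steps = number of batches (last batch partial).
total_steps = math.ceil(DATASET_SIZE / BATCH_SIZE)

# Assumption: warmup length is the ratio applied to total steps, rounded up.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def epoch_at(step):
    """Fraction of the epoch completed at a given step, to 4 decimals."""
    return round(step / total_steps, 4)


def linear_lr(step):
    """Linear warmup to LEARNING_RATE, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = total_steps - step
    return LEARNING_RATE * max(0.0, remaining / (total_steps - warmup_steps))


print(total_steps, warmup_steps)
print(epoch_at(25), epoch_at(11525))
print(linear_lr(warmup_steps), linear_lr(total_steps))
```

Under these assumptions the computed total matches the final logged step (11543), and the epoch fractions match the first and last training-loss rows of the table.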
### Framework Versions
- Python: 3.10.16
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0
- PyTorch: 2.4.0
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```