tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:190175
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Congruence of Triangles. Triangles. Maths. CBSE 9. CBSE Content - Final.
      CBSE. 
    sentences:
      - 'Expressing Multiplication Sentences Practice. . '
      - 'Prove R-H-S criteria for congruence of triangle. . '
      - >-
        DNA. . I'm sure many of y'all have already heard of the molecule
        DNA, and it stands for deoxyribonucleic acid. I wrote it out ahead of
        time to spare you the pain of watching me spell this in real time. But
        it is-- and I think you already have an idea. This is the basic unit of
        heredity, or it's what codes all of our genetic information. And what I
        want to do in this video-- because I think that's kind of common
        knowledge. That's popular knowledge that, oh, everything that makes my
        hair black or my eyes blue or whatever, that's all somehow encoded in
        our DNA. But what I want to do in this video is give you an idea of how
        something like DNA, a molecule, can actually code for what we are. How
        does the information, one, get stored in this type of a molecule, then
        how does that actually turn into the proteins that make up our enzymes
        and our organs and our brain cells and everything else that really make
        us us? So this is a computer graphics representation of DNA, and I'm
        sure many of y'all have heard of the double helix. And that's in
        reference to the structure that DNA takes. And you can see here it's a
        double helix. As you can see here, you have two of these lines, and
        they're intertwined with each other. You see there, that's one of them,
        and then you see another one intertwined like that. And then they're
        connected by-- you can almost view it as like these bridges between the
        two helixes, and they twist around each other. I think you get the idea.
        So the double helix just describes the structure, the shape that DNA
        takes, and it leads to all sorts of interesting repercussions in terms
        of how heredity takes place and how natural selection and variation
        might take place as well. And actually, in the future, I do want to
        actually read with you Watson and Crick's paper on the double helix
        where they essentially talk about their discovery. The best thing about
        that paper, besides the fact that it was probably one of the biggest
        discoveries in the history of mankind, is that the paper is only a page
        and a half long, and it goes to my general view that if you have
        something good to say, it shouldn't take you that long to say it. But
        with that said, let's think a little bit about how this can actually
        generate the proteins and whatever else that make up all of us. So right
        here this is a zoomed-up version of that graphic that I just showed you
        a little bit earlier, and this is each of the helixes. So if this is the
        magenta side, if you unwound this helix-- right now it shows it in its
        wound state, but if I unwind this helix, one side would maybe be this
        magenta side of our helix and then one side is this green side, right?
        And if you twist it up, you get back to this drawing up here. And then
        these bridges that you see in this drawing in the double helix, those
        are these connections right here. These are the bridges. Now,
        what allows us to code information is that the blocks that make up the
        bridges are made of different molecules. And the four different
        molecules that are made up in DNA are adenine-- and it's written here on
        this little chart. I got all of this from Wikipedia, so if you want more
        information I encourage you to go there. Adenine, that's up here. This
        is the molecular structure of adenine. It's connected to a sugar right
        here, ribose. I won't go into a deoxyribose. And then you have your
        phosphate group. But these kind of form the backbone of the DNA: the
        sugar and the phosphate groups. And I'm not going to go into the
        microbiology of it, because that's not important right now to
        understanding just how does this intuitively code for what we are. So
        along the backbone, which is identical, and we'll talk about it. They
        run in different directions. It's called antiparallel, so they label the
        ends. And I'm not going to go into detail there, but the important thing
        are these bases here. So you have adenine, and adenine pairs with
        thymine, and you see that up here. If you have an adenine molecule here,
        an adenine base here, it'll pair with thymine, and this is called the
        base pair. Adenine and thymine pair with each other. If you have
        thymine, it's going to pair with adenine. And then you have guanine and
        it pairs with cytosine. And the names of these, you should know
        these names, just because they are almost-- well, if you ever enter any
        discussion about DNA and base pairs, this is expected knowledge. But the
        names of the molecules and how they're structured, not important just
        yet. But what's important is the fact that there are four of them and
        that they essentially code information. So you can view one of these
        strands in kind of a simplified way. You can just view it as a strand
        of-- so this one, if it has an adenine and then it has a cytosine, then
        it has a guanine. That's a guanine. They did it in purple. And then it
        has a-- oh, no, it has a thymine, not a guanine. So it has a thymine in
        purple, and then in blue, it has a guanine. So this strand right here
        codes ACTG. And if you were to code the opposite side of the strand, you
        could immediately-- I don't even have to look here. I can look at this
        side and say, OK, adenine will pair with thymine, cytosine pairs with
        guanine, thymine pairs with adenine, and guanine pairs with cytosine. So
        they're complementary strands. So if you think about it, they're really
        coding the same thing. If you have one of them, you have all of the
        information for the other. Now, in our DNA, in a human's DNA, you might
        say, hey, Sal, how do I go from these little chains of these molecules?
        How does that turn into me? How does that turn into this complex
        organism? And the simple answer is, well, the human genome has three
        billion of these base pairs. And that's actually just in half of
        your chromosomes. And I'll tell you, maybe in this video or a future
        video, why we only consider half of your chromosomes, and that's because
        essentially you have a pair of every chromosome. I'll talk in more
        detail about that. And this number, to some people, they might say, it
        only takes three billion base pairs to describe who I am? And some
        people would say, wow, it takes three billion base pairs to describe who
        I am. I never thought I was that complex. So depending on your point of
        view, this is either a large or small number. But when you take these
        three billion base pairs, you're actually encoding all of the
        information that it takes to make in this case a human being. And
        actually it turns out a lot of primates don't have that many different
        base pairs than human beings. The amazing thing is even things like
        roundworms and fruit flies also number in a surprisingly large fraction
        of the base pairs of a human being. Maybe I'll do another video where I
        go into comparative biology. But how do these base pairs actually lead
        to proteins? I mean, it's fair enough. That's information. It's like you
        can view these as ones and zeroes in some type of computer language, but
        really they're not just ones and zeroes, because they can take on four
        different values. They can take on an A, a T, a C or a G, so you could
        think of them as zero, ones, twos and threes, but I won't go into that
        whole aspect of it just now. So how does that actually code information?
        So DNA when it actually transcribes something-- the process is called
        transcription, and I'm going to do a pretty gross simplification of it,
        but I think it'll give you the gist of how it codes for proteins. So
        what happens when transcription happens is that these two strands split
        up, and one of the strands-- let me just take one of them. Let's say it
        looks like this. I'll do it all in one color. Let's say it's just
        ATGGACG-- I'm just making up stuff-- TA. Let's say that that's the
        strand that got split up. And what happens is it transcribes-- and I
        won't say itself. There's a whole bunch of enzymes and proteins and a
        whole bunch of chemical reactions that have to happen, but this DNA
        essentially transcribes a complementary mRNA. And I'll introduce RNA.
        It's essentially the exact same thing as-- well, the word is
        ribonucleic acid, so it's literally-- you get rid of the deoxy, so you
        can kind of say it's got its oxy, and it's ribonucleic acid, but it's
        very similar to DNA. It codes in the exact same way. The only difference
        between RNA, instead of a thymine, it has something called a uracil. So
        every place where you would have expected a thymine, you would have
        expected a T, you'll now see a U. So, for example, if this is the DNA
        strand, then an RNA, an mRNA, in a messenger RNA strand, will be built
        complementary to this. So it'll be built-- let's see. With A, you'd
        normally have thymine when you're talking DNA, but now we're talking
        RNA, so it'll be a uracil, then an adenine, cytosine, cytosine, uracil,
        then we got a guanine, a cytosine, an adenine, and then we'll have a
        uracil. So this is the mRNA strand here. And all of this is occurring
        inside the nucleus of your cells. And we'll do a whole series of videos
        in the future about the structure of our cells, but I think most of us
        know that our cells-- and I'll talk more about eukaryotic and
        prokaryotic organisms in the future, but most complex organisms, they
        have a cell nucleus where we have all of our chromosomes that contain
        all of our DNA. And so this mRNA then detaches itself from the DNA that
        it was transcribed from, and then it leaves the nucleus, and it goes to
        these structures called ribosomes. I'm oversimplifying it a little bit,
        but at the ribosomes, this mRNA is translated into proteins. So let me
        do that. So let's say this is the mRNA. It was transcribed from that
        DNA, so let me get rid of that DNA now. I got rid of the DNA. This is
        the mRNA that we were able to transcribe from that DNA, and they have
        these other things called tRNA or transfer RNA. And what these are-- and
        this is the really interesting part. So you may or may not know that
        pretty much everything we are is made up of proteins. And these
        proteins, the building blocks of proteins are amino acids. And for those
        of you who like to lift weights, I'm sure you've seen ads for amino acid
        supplements and things of the like. And the reason why they talk about
        amino acids is because those are the building blocks of proteins. My son
        actually has an allergy to milk protein, so we had to get him a formula
        that was just pure amino acids, just all of the milk proteins broken
        down. So if you look at a protein, it's actually a chain of these amino
        acids and usually a fairly long chain. We'll look at some protein
        structures in the very near future, just to give you an idea of things.
        It's a very long chain of these amino acids, and there are actually 20
        different amino acids. Twenty different amino acids are pretty much the
        structure of all of our proteins. Let me write that. So a very
        obvious question is how can these things code for 20 different amino
        acids? I can only have four different things in this little bucket right
        here. And then you just have to go back to your combinatorics, or if you
        can't go back to it to watch the playlist on probability and
        combinatorics, and say, OK, there's only four ways that I can have for
        each of these bases. There's only four different bases that I can have
        here, either an adenine, guanine, cytosine or, depending on whether we're
        talking about DNA or RNA, a uracil or a thymine. But how can we increase
        the combinations? Well, if we include two of them, if we include two
        bases, then how many combinations can we have? Well, we have four
        possibilities here, then we'd have four possibilities here, so we'd have
        16 possibilities. But that's still not enough. That's still not enough
        to code for one of 20 amino acids to say, hey, this is going to code for
        amino acid number five, and we'll talk more about their actual names. So
        what do we have to do? Well, we have to use three of them. So three of
        them, there's actually four times four times four possibilities here, so
        they could code for 64 different things. They could take on 64 different
        combinations or permutations, this UAC right here. So if we have three
        of these bases, we can actually code for an amino acid. Actually, it's
        overkill, because we can actually have 64 combinations here, and there
        are only 20 amino acids, so we can even have redundant combinations code
        for different amino acids. For example, we might say that, and this
        isn't the actual code, but maybe UAC, and I should look these up. This
        codes for amino acid number 1. And if it was AAU, then this codes for
        amino acid number 2. And if I have-- I mean, I think you get the idea.
        If I have GGG, this codes for amino acid number 10. And what happens is
        when this messenger RNA leaves the nucleus, it goes to the ribosomes,
        and at the ribosomes-- we're going to look at that diagram in a few
        seconds-- but at the ribosomes-- let me take my same mRNA molecule. And
        they're much longer than what I'm showing here. This is just a fraction
        of an mRNA molecule. So I'll take my mRNA molecule, and what they
        do is they essentially act as a template for tRNA molecules. And tRNA
        molecules are these molecules that are attached to the-- they're almost
        like the trucks for the amino acids. So let's say I have some amino acid
        right here, and then I have another amino acid that's right here like
        that, and then I have another amino acid that's like that. They'll be
        attached to tRNA molecules. So let's say that this tRNA molecule has on
        it-- so this amino acid is attached to a tRNA molecule that has the code
        on it A-- let me do it in a darker color. It has the code AUG.
        This one right here has the code-- let me pick another one. Let's say it
        has GGAC. So what's going to happen? When you're in the ribosome,
        and it's a complex situation, but actually what's happening isn't too
        fancy. This tRNA, it wants to bond to this part of the mRNA. Why?
        Because adenine bonds with uracil, uracil bonds with adenine, and
        guanine bonds with cytosine, so it'll pull up right here. It'll pull up
        right next to this thing, and actually, I should probably-- well, I
        don't know if I can rotate it. But it'll just pull up right here and
        attach to this mRNA molecule. And this right here is tRNA. This
        is mRNA. And the names don't matter. I really just want to give you the
        big picture idea of how the proteins are actually formed. And this is an
        amino acid. I don't know, let's call it amino acid 1, amino acid 5,
        amino acid 20. This guy, he's going to pull up right here. The guanine
        is attracted to the cytosine, and if you watch the chemistry videos,
        these are actually hydrogen bonds that form the base pairs. Adenine
        wants to pull up to uracil, cytosine to guanine, and so on and so forth.
        And so once all of these guys have pulled up-- let me do that. So once
        you've pulled up, let's say that this is-- I could do it up here. This
        is my mRNA molecule. I'm not going to draw the specifics right there. My
        little tRNAs pull up, pull up next to it, and they each hold a payload,
        right? So this first one holds this payload right here of this amino
        acid. The second one holds this payload of this amino acid and so forth
        and so on. And so it might keep going, and there's another green amino
        acid here. They really don't have those colors, but I'm just-- just for
        the sake of simplicity like that. And then the amino acids bond to each
        other when they're held like that close to each other. This doesn't
        happen all by itself. The ribosome serves a purpose, and there are
        enzymes that facilitate this process, but once these guys bond together,
        the tRNA detaches, and you have this chain of amino acids. And then the
        chain of amino acids starts to bend around so they have all of these--
        and it's actually a fascinating-- I mean, people spend their lives
        studying how proteins fold, and that's actually where they get most of
        their structural properties. It's not just the chain of the amino acids,
        but what's more important is how these amino acids actually fold. So
        once you fold them, they form these really ultracomplex patterns based
        on what amino acid is attracted to what other amino acid in these very
        intricate three-dimensional shapes. And what I took here from Wikipedia
        is these are some amino acids. And just to be able to relate this to the
        DNA, this right here is insulin. It's key in our ability to process
        glucose in our body. So this right here is insulin. It's a hormone. So
        sometimes you hear people talk about your immune system. Sometimes you
        hear people talking about your endocrine system and hormones, sometimes
        your digestive system. This is hemoglobin, what essentially transports
        our oxygen in our blood. But all of these things are proteins, and all
        these little, little folds you see, these are all little amino-- I mean,
        they're just little dots of amino acids. Some of these are multiple
        chains of amino acids kind of fitting together like a big puzzle, but
        some of them are just single chains of amino acids. For insulin right
        here, this is 50 amino acids. And then once the chain forms, it all
        bundles together and forms this little blob like you see, but the shape
        of that blob is super important for insulin being able to perform the
        function that it needs to perform in our systems. But this right here is
        approximately 50-- I forgot the exact number-- amino acids. This
        right here, this immunoglobulin G, which is part of our immune system,
        this is roughly 1,500 amino acids. So how much DNA or how many base
        pairs had to code for this? Well, three times as much, right? Because
        you have to have three base pairs that code for one amino acid, and
        actually, three base pairs, this is called a codon, because it codes for
        amino acids. So three base pairs make a codon. So if you have 50 amino
        acids that make up insulin, that means you're going to have to have 50
        codons, which means you have to have 150 bases or 150 of these A's and
        G's and T's. If you have 1,500 amino acids, that means you're going to
        have to have 1,500 codons, which means you're going to have roughly
        4,500 of these base pairs that code for it. Now, there are some notions
        that get confused a lot, so I went to kind of the smallest level of our
        DNA right here, and this is the level at which-- well, this is RNA that
        I'm pointing to right there, but this is the smallest level of DNA, and
        that's the level at which the information is actually coded. But how
        does that relate to things like genes and chromosomes and things that
        you might talk about in other contexts? So let's say the 150 base
        pairs that coded for insulin, these make up a gene. And these
        4,500 base pairs make up another gene. Now, all of the genes don't make
        proteins, but all of the proteins are made by genes. So let's say I have
        just a bunch of-- I'll just make another A, G, and it goes down, down,
        down, and you have a T and then a C and a C, and let's say I have 4,500
        of these. These could code for a protein. These could code for protein,
        or they could have all of these other kind of regulatory functions
        telling what other parts of the DNA should and should not be coded and
        how the DNA behaves, so it becomes super, super complex. But this kind
        of section of our DNA, this is what we refer to as a gene, and a gene
        can have anywhere from a couple of hundreds of these base pairs or these
        bases to several thousand of these base pairs. Now, a gene is that part
        of our chromosome that codes for a particular protein or serves a
        certain function. Now, there are different versions of genes.
        It's a gross oversimplification, but let me say this is the gene for
        insulin. Now, there might be slight variations in how insulin can
        be coded for, and I'm kind of going out of my domain right here, because
        I don't know if that's true. And maybe I shouldn't just speak
        specifically about insulin, but it's coding for some protein, but
        there's maybe multiple different ways that that protein can be coded.
        Maybe instead of a T here, sometimes there's a C there. It still codes
        for the same protein. It doesn't change it quite enough, but that
        protein acts just a little bit different. It's a slight variant. I'll
        use that word. Now, each variant of this gene is called an allele.
        It's a specific variant of your gene. Now, if you take
        this DNA chain, and this chain over here-- let's see. This is one base
        pair. This might be like one base. This is another base. Maybe this is
        an adenine and then this would be a thymine over here in green. This is
        an adenine and this would be a thymine. If right here this is a guanine,
        then right here would be a cytosine. This would be just a very small
        section. If I were to like zoom out, and let's say we have a big chain
        of DNA where each of these little dots are a base pair that I'm drawing
        here, maybe this section codes for gene 1. And then there's some noise
        or things that we haven't fully understood yet. Now, I want to be clear.
        Just with a simple discussion of DNA, we're already kind of approaching
        the frontiers of what we know and what we don't know, because DNA is
        hugely complex, and there's all of these feedback structures, and
        certain genes tell you to code for other genes and not to code for other
        genes and to code under certain circumstances, hugely complex. So
        there's huge sections of DNA that we still don't understand what exactly
        they do. But then maybe they'll have another section here that codes for
        gene 2. Maybe gene 2 is a little bit longer. Maybe it's 1,000 base
        pairs. But when you take all of these and you turn it into a-- it kind
        of winds in on itself like this. Let me do it. So it'll wind up, winding
        in on itself like this and do all sorts of crazy things. Remember, it
        completely bundles itself up, and then it looks something like that.
        Then you get a chromosome. And just to get an idea of how large a
        chromosome is compared to the actual base pairs, chromosome number one
        in the human genome-- so we have 23 pairs. If you look at it inside of a
        nucleus-- so let's say that's the nucleus. Let's say this is the cell.
        The cell is much bigger than what I'm showing. But we have 23 pairs of
        chromosomes. I won't do all of them. You can actually see
        chromosomes in a not-too-expensive microscope, so we're already getting
        to a scale that we can start to look at. But the largest chromosome,
        which is chromosome number one in the human genome, just to give an idea
        of how much information it's packing, that thing right there has 220
        million base pairs. Sometimes people talk about chromosomes and genetics
        and genes and base pairs interchangeably, but it's very important to
        kind of get an idea of scale. These chromosomes are a super-long strand
        of DNA that's all configured and bundled up, and it contains 220 million
        base pairs. So the actual elements that are coding for the information
        are unbelievably small relative to the chromosome itself. But now that
        we understand a little bit, and actually I want to take a look back at
        this, because this kind of blows my mind, that if you just take those
        little combinations of those amino acids, you can form these very
        intricate, very advanced structures that we're still fully understanding
        how they actually interact with each other and regulate how all of our
        biological processes work. And what's even more amazing is that this
        scheme that I've talked about in this video about DNA to mRNA to tRNA to
        these molecules, this is true for all of life on our planet, so we all
        share this same mechanism. Me and this plant, we share that common root
        that we all have DNA. As different as me and that roach that I might not
        like to be in the same room, we all share that same common root of DNA
        and that all of it codes to proteins in this exact same way, that
        there's this commonality amongst all life. That, to me, is mind blowing.
        Then even more mind blowing is how these very complex shapes are formed
        by the DNA. And this isn't speculation. This is observed behavior. This
        is a fascinating structure right here, but it's just based on 20 amino
        acid-- you can almost view the amino acid as the LEGOS, and you put the
        LEGOS together, and just the chemical interactions form these fairly
        impressive structures right here. So now that we know a little bit about
        DNA and how it codes into protein, we can take a little jump back and
        talk a little bit more about how variation is actually introduced into a
        population.
  - source_sentence: >-
      Explore. Assessments. Cell. Cell Structure and Micro-organisms. Grade 7.
      Science channel. 
    sentences:
      - >-
        Area Builder. Create your own shapes using colorful blocks and explore
        the relationship between perimeter and area. Compare the area and
        perimeter of two shapes side-by-side. Challenge yourself in the game
        screen t. 
      - 'Cells Practice. . '
      - ": Human Actions and the Sixth Mass Extinction. . This is one of the most powerful birds (http://www.ck12.org/biology/Birds) in the world. Could it go extinct?\n\nThe Philippine Eagle, also known as the Monkey-eating Eagle, is among the rarest, largest, and most powerful birds (http://www.ck12.org/biology/Birds) in the world. It is critically endangered, mainly due to massive loss of habitat due to deforestation in most of its range. Killing a Philippine Eagle is punishable under Philippine law by twelve years in jail and heavy fines.\n\nHuman Actions and the\_Sixth Mass Extinction\n\nOver 99 percent of all species that ever lived on Earth have gone extinct. Five mass extinctions (http://www.ck12.org/life-science/Mass-Extinctions-in-Life-Science) are recorded in the fossil record (http://www.ck12.org/biology/The-Fossil-Record). They were caused by major geologic and climatic events. Evidence shows that a sixth mass extinction is occurring now. Unlike previous mass extinctions (http://www.ck12.org/life-science/Mass-Extinctions-in-Life-Science), the sixth extinction is due to human actions.\n\nSome scientists consider the sixth extinction to have begun with early hominids during the Pleistocene. They are blamed for over-killing big mammals such as mammoths. Since then, human actions have had an ever greater impact on other species. The present rate of extinction is between 100 and 100,000 species per year. In 100 years, we could lose more than half of Earth’s remaining species.\n\nCauses of Extinction\n\nThe single biggest cause of extinction today is habitat loss. Agriculture (http://www.ck12.org/chemistry/Agriculture), forestry, mining, and urbanization have disturbed or destroyed more than half of Earth’s land area. In the U.S., for example, more than 99 percent of tall-grass prairies have been lost. Other causes of extinction today include:\n\nExotic species introduced by humans into new habitats. 
They may carry disease, prey on native species, and disrupt food webs. Often, they can out-compete native species because they lack local predators. An example is described in Figure below (http://www.ck12.org/book/CK-12-Biology-Concepts/section/6.26/#x-ck12-QmlvLTEyLTIzLWJyb3duLXRyZWUtc25ha2U.).\n\nOver-harvesting of fish (http://www.ck12.org/biology/Fish), trees, and other organisms. This threatens their survival and the survival of species that depend on them.\n\nGlobal climate change, largely due to the burning of fossil fuels. This is raising Earth’s air and ocean temperatures. It is also raising sea levels. These changes threaten many species.\n\nPollution, which adds chemicals, heat (http://www.ck12.org/physical-science/Heat-in-Physical-Science), and noise to the environment beyond its capacity to absorb them. This causes widespread harm to organisms.\n\nHuman overpopulation, which is crowding out other species. It also makes all the other causes of extinction worse.\n\nThe brown tree snake is an exotic species that has caused many extinctions on Pacific islands such as Guam.\n\nEffects of Extinction\n\nThe results of a study released in the summer of 2011 have shown that the decline in the numbers of large predators like sharks, lions and wolves is disrupting Earth's ecosystem in all kinds of unusual ways. The study, conducted by scientists from 22 different institutions in six countries, confirmed the sixth mass extinction. The study states that this mass extinction differs from previous ones because it is entirely driven by human activity through changes in land use, climate, pollution, hunting, fishing and poaching. The effects of the loss of these large predators can be seen in the oceans and on land.\n\nFewer cougars in the western US state of Utah led to an explosion of the deer population. 
The deer ate more vegetation, which altered the path of local streams and lowered overall biodiversity (http://www.ck12.org/biology/Biodiversity).\n\nIn Africa, where lions and leopards are being lost to poachers, there is a surge in the number of olive baboons, who are transferring intestinal parasites to humans living nearby.\n\nIn the oceans, industrial whaling led to a change in the diets of killer whales, who eat more sea lions, seals, and otters and have dramatically lowered the population counts of those species.\n\nThe study concludes that the loss of big predators has likely driven many of the pandemics, population collapses and ecosystem shifts the Earth has seen in recent centuries.\n\nDisappearing Frogs\n\nAround the world, frogs are declining at an alarming rate due to threats like pollution, disease, and climate change. Frogs bridge the gap between water (http://www.ck12.org/biology/Water-Advanced) and land habitats, making them the first indicators (http://www.ck12.org/chemistry/Indicators) of ecosystem changes.\n\nNonnative Species\n\nScoop a handful of critters out of the San Francisco Bay and you'll find many organisms from far away shores. Invasive kinds of mussels, fish (http://www.ck12.org/biology/Fish), and more are choking out native species, challenging experts around the state to change the human behavior that brings them here.\n\nHow You Can Help Protect Biodiversity\n\nThere are many steps you can take to help protect biodiversity (http://www.ck12.org/biology/Biodiversity). For example:\n\nConsume wisely. Reduce your consumption wherever possible. Re-use or recycle rather than throw out and buy new. When you do buy new, choose products that are energy (http://www.ck12.org/physics/Energy) efficient and durable.\n\nAvoid plastics. Plastics are made from petroleum and produce toxic waste.\n\nGo organic. Organically grown food is better for your health. 
It also protects the environment from pesticides and excessive nutrients in fertilizers.\n\nSave energy (http://www.ck12.org/physics/Energy). Unplug electronic equipment and turn off lights when not in use. Take mass transit instead of driving.\n\nLost Salmon\n\nWhy is the salmon population of Northern California so important? Salmon do not only provide food for humans, but also supply necessary nutrients for their ecosystems (http://www.ck12.org/biology/Ecosystems). Because of a sharp decline in their numbers, in part due to human interference, the entire salmon fishing season off California and Oregon was canceled in both 2008 and 2009. The species in the most danger of extinction is the California coho salmon.\n\nSummary\n\nEvidence shows that a sixth mass extinction is occurring. The single biggest cause is habitat loss caused by human actions.\n\nThere are many steps you can take to help protect biodiversity. For example, you can use less energy (http://www.ck12.org/physics/Energy).\n\nReview\n\nHow is human overpopulation related to the sixth mass extinction?\n\nWhy might the brown tree snake or the Philippine Eagle serve as “poster species” for causes of the sixth mass extinction?\n\nDescribe a hypothetical example showing how rising sea levels due to global  (http://www.ck12.org/earth-science/Global-Warming)warming (http://www.ck12.org/earth-science/Global-Warming) might cause extinction.\n\nCreate a poster that conveys simple tips for protecting biodiversity.\n\nResources"
  - source_sentence: >-
      Classifying geometric shapes. Plane figures. 4th grade. Math by grade.
      Khan Academy (English - US curriculum). 
    sentences:
      - >-
        Classifying shapes by lines and angles. Lindsay classifies a shape based
        on hints about its sides and angles.


        . - [Voiceover] Which shape matches all three clues? So here we have
        three clues and we want to see which shape down below matches all three
        of these statements. So let's start with the first clue. The first clue
        says the shape is a quadrilateral, quad meaning four-sided. So looking
        down here at our shapes, let's see which ones match that first clue.
        Shape one has one, two, three, four sides. So it is a quadrilateral.
        Shape two has one, two, three, four sides. So also a quadrilateral.
        Shape three has one, two, three, four, five, six sides. So it is not a
        quadrilateral. It's a six-sided shape or a hexagon. So we can rule that
        one out. It doesn't match clue one so there's no way it can match all
        three clues. And finally shape four has one, two, three, four sides
        again so it is also a quadrilateral. So after clue one, we still have
        three possible answers. This first shape, the second shape, and the
        fourth shape all match clue one, they're all quadrilaterals. Looking at
        clue two, it says our shape has no right angles. Right angles are also
        90 degree angles. Right angles are 90 degree angles and they look
        something like this and we often see them marked with a square in the
        middle because they are sort of like square angles. We can create a
        square from the opening that they form, that these angles form. So this
        is a right angle. Looking now down at our shapes, we can see right away
        on shape one has two right angles. There's a square corner and another
        square corner. So this has right angles, but the shape we're looking for
        has no right angles so we can rule this shape out. Shape two does not
        have any right angles. These are not squared off corners. And same with
        shape four, no right angles. So both of those still match both clues one
        and two. So we have two shapes left. They're both quadrilaterals and
        they have no right angles. And finally our last clue, the shape has four
        sides, we knew that 'cause it was a quadrilateral, and those sides are
        of equal length. That means each of the sides is the same length.
        Looking at this first one that we have left, shape two, it looks like
        these sides on the ends are shorter than the sides going up and down. So
        it looks like they are not equal length. So we can rule this one out.
        But let's be sure this last one works. Here the sides all look like
        they're the same length, but the way we can know for sure that they are
        is these tick marks. Any time you have these marks, it's saying that any
        side that has the same amount of marks is the same length. All four of
        these sides have exactly one tick mark so they are all equal in length.
        So shape four matches all three clues. It is a quadrilateral, there are
        no right angles and it has four sides of equal length. So shape four is
        our answer.
      - 'Resistors in Series. . '
      - >-
        Amoeba in motion. This a video of an Amoeba . Movement of the Amoeba is
        shown. First the colorless ectoplasma moves in front of the pseodopodia,
        followed by the grained entoplasma. The video is done with the phase
        contrast technique. Please have a look at my homepage for more:

        http://www.dr-ralf-wagner.de. 
  - source_sentence: >-
      Electromagnet. Electricity and Magnetism. Physical Science. Science.
      K-12. 
    sentences:
      - 'Determining Unknown Angles in Complex Composite Figures Practice. . '
      - 'Electromagnet. . '
      - >-
        Literal vs figurative language Exercise. . It this an example of literal
        or figurative language?  


        He has lost his marbles.


        - Literal

        - Figurative

        - It could be both.



        Has the word literally been used correctly in this sentence?


        Stars are literally millions of kilometres away.


        - Yes

        - No



        Has the word literally been used correctly in this sentence?  


        I haven't been to a comic book store in literally a million years.


        - Yes

        - No



        Is this an example of literal or figurative language?  


        The old wall is falling apart.


        - Literal

        - Figurative

        - It could be both.



        Is this an example of literal or figurative language?  


        Our debating team is falling apart.


        - Literal

        - Figurative

        - It could be both.



        Is this an example of literal or figurative language?  


        I am feeling blue.


        - Literal

        - Figurative

        - It could be both.



        Is this an example of literal or figurative language?


        The sky is blue.


        - Literal

        - Figurative

        - It could be both.



        What is the danger of writing using only literal language?


        - The language can be dry and boring.

        - Meaning can be lost.

        - Meaning can be exaggerated.

        - There are no dangers of writing in literal language.



        Which of these is most likely to be written using literal language?


        - A recipe

        - A poem

        - A soliloquy

        - A short story



        Which of the following would you not find in literal language?


        - Descriptive words

        - Direct language

        - Exactly what's happening in the story

        - Similes
  - source_sentence: >-
      Determining Unknown Angles in Complex Composite Figures. Triangles.
      Geometry. Grade 4. Elementary Math. Math. K-12. 
    sentences:
      - 'Determining Unknown Angles in Complex Composite Figures. . '
      - 'Area of parallelograms. . '
      - >-
        Initial value & common ratio of exponential functions. Get comfortable
        with the basic ingredients of exponential functions: the

        Initial value and the common ratio.


        . - [Voiceover] So let's think about a function. I'll just give an
        example. Let's say, h of n is equal to one-fourth times two to the n.
        So, first of all, you might notice something interesting here. We have
        the variable, the input into our function. It's in the exponent. And a
        function like this is called an exponential function. So this is an
        exponential. Ex-po-nen-tial. Exponential function, and that's because
        the variable, the input into our function, is sitting in its definition
        of what is the output of that function going to be. The input is in the
        exponent. I could write another exponential function. I could write, f
        of, let's say the input is a variable, t, is equal to is equal to five
        times times three to the t. Once again, this is an exponential function.
        Now there's a couple of interesting things to think about in exponential
        function. In fact, we'll explore many of them, but I'll get a little
        used to the terminology, so one thing that you might see is a notion of
        an initial value. In-i-tial Intitial value. And this is essentially the
        value of the function when the input is zero. So, for in these cases,
        the initial value for the function, h, is going to be, h of zero. And
        when we evaluate that, that's going to be one-fourth times two to the
        zero. Well, two to the zero power, is just one. So it's equal to
        one-fourth. So the initial value, at least in this case, it seems to
        just be that number that sits out here. We have the initial value times
        some number to this exponent. And we'll come up with the name for this
        number. Well let's see if this was true over here for, f of t. So, if we
        look at its intial value, f of zero is going to be five times three to
        the zero power and, the same thing again. Three to the zero is just one.
        Five times one is just five. So the initial value is once again, that.
        So if you have exponential functions of this form, it makes sense. Your
        initial value, well if you put a zero in for the exponent, then the
        number raised to the exponent is just going to be one, and you're just
        going to be left with that thing that you're multiplying by that.
        Hopefully that makes sense, but since you're looking at it, hopefully it
        does make a little bit. Now, you might be saying, well what do we call
        this number? What do we call that number there? Or that number there?
        And that's called the common ratio. The common common ratio. And in my
        brain, we say well why is it called a common ratio? Well, if you thought
        about integer inputs into this, especially sequential integer inputs
        into it, you would see a pattern. For example, h of, let me do this in
        that green color, h of zero is equal to, we already established
        one-fourth. Now, what is h of one going to be equal to? It's going to be
        one-fourth times two to the first power. So it's going to be one-fourth
        times two. What is h of two going to be equal to? Well, it's going to be
        one-fourth times two squared, so it's going to be times two times two.
        Or, we could just view this as this is going to be two times h of one.
        And actually I should have done this when I wrote this one out, but this
        we can write as two times h of zero. So notice, if we were to take the
        ratio between h of two and h of one, it would be two. If we were to take
        the ratio between h of one and h of zero, it would be two. That is the
        common ratio between successive whole number inputs into our function.
        So, h of I could say h of n plus one over h of n is going to be equal to
        is going to be equal to actually I can work it out mathematically.
        One-fourth times two to the n plus one over one-fourth times two to the
        n. That cancels. Two to the n plus one, divided by two to the n is just
        going to be equal to two. That is your common ratio. So for the function
        h. For the function f, our common ratio is three. If we were to go the
        other way around, if someone said, hey, I have some function whose
        initial value, so let's say, I have some function, I'll do this in a new
        color, I have some function, g, and we know that its initial initial
        value is five. And someone were to say its common ratio its common ratio
        is six, what would this exponential function look like? And they're
        telling you this is an exponential function. Well, g of let's say x is
        the input, is going to be equal to our initial value, which is five.
        That's not a negative sign there, Our initial value is five. I'll write
        equals to make that clear. And then times our common ratio to the x
        power. So once again, initial value, right over there, that's the five.
        And then our common ratio is the six, right over there. So hopefully
        that gets you a little bit familiar with some of the parts of an
        exponential function, why they are called what they are called.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@10
  - cosine_precision@50
  - cosine_precision@100
  - cosine_recall@10
  - cosine_recall@50
  - cosine_recall@100
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: eval ir
          type: eval-ir
        metrics:
          - type: cosine_accuracy@1
            value: 0.6326203208556149
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.7914438502673797
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.8481283422459893
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.8967914438502674
            name: Cosine Accuracy@10
          - type: cosine_precision@10
            value: 0.23825311942959004
            name: Cosine Precision@10
          - type: cosine_precision@50
            value: 0.0709126559714795
            name: Cosine Precision@50
          - type: cosine_precision@100
            value: 0.03923529411764706
            name: Cosine Precision@100
          - type: cosine_recall@10
            value: 0.7040714788488945
            name: Cosine Recall@10
          - type: cosine_recall@50
            value: 0.8725457895726481
            name: Cosine Recall@50
          - type: cosine_recall@100
            value: 0.9169531730172458
            name: Cosine Recall@100
          - type: cosine_ndcg@10
            value: 0.652860842686591
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7233662960133574
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.5971091711102727
            name: Cosine Map@100

SentenceTransformer

This is a trained sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Maximum Sequence Length: 128 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
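The Pooling and Normalize stages listed above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function names `mean_pool` and `l2_normalize` and the toy vectors are assumptions made for illustration, and the real model pools 768-dimensional token embeddings produced by MPNetModel.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale the vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]

# Hypothetical toy input: three tokens with two dimensions, the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)          # the padding token does not contribute
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the final stage normalizes every embedding to unit length, the cosine similarity between two embeddings reduces to their dot product.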

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Determining Unknown Angles in Complex Composite Figures. Triangles. Geometry. Grade 4. Elementary Math. Math. K-12. ',
    'Determining Unknown Angles in Complex Composite Figures. . ',
    "Initial value & common ratio of exponential functions. Get comfortable with the basic ingredients of exponential functions: the\nInitial value and the common ratio.\n\n. - [Voiceover] So let's think about a function. I'll just give an example. Let's say, h of n is equal to one-fourth times two to the n. So, first of all, you might notice something interesting here. We have the variable, the input into our function. It's in the exponent. And a function like this is called an exponential function. So this is an exponential. Ex-po-nen-tial. Exponential function, and that's because the variable, the input into our function, is sitting in its definition of what is the output of that function going to be. The input is in the exponent. I could write another exponential function. I could write, f of, let's say the input is a variable, t, is equal to is equal to five times times three to the t. Once again, this is an exponential function. Now there's a couple of interesting things to think about in exponential function. In fact, we'll explore many of them, but I'll get a little used to the terminology, so one thing that you might see is a notion of an initial value. In-i-tial Intitial value. And this is essentially the value of the function when the input is zero. So, for in these cases, the initial value for the function, h, is going to be, h of zero. And when we evaluate that, that's going to be one-fourth times two to the zero. Well, two to the zero power, is just one. So it's equal to one-fourth. So the initial value, at least in this case, it seems to just be that number that sits out here. We have the initial value times some number to this exponent. And we'll come up with the name for this number. Well let's see if this was true over here for, f of t. So, if we look at its intial value, f of zero is going to be five times three to the zero power and, the same thing again. Three to the zero is just one. Five times one is just five. 
So the initial value is once again, that. So if you have exponential functions of this form, it makes sense. Your initial value, well if you put a zero in for the exponent, then the number raised to the exponent is just going to be one, and you're just going to be left with that thing that you're multiplying by that. Hopefully that makes sense, but since you're looking at it, hopefully it does make a little bit. Now, you might be saying, well what do we call this number? What do we call that number there? Or that number there? And that's called the common ratio. The common common ratio. And in my brain, we say well why is it called a common ratio? Well, if you thought about integer inputs into this, especially sequential integer inputs into it, you would see a pattern. For example, h of, let me do this in that green color, h of zero is equal to, we already established one-fourth. Now, what is h of one going to be equal to? It's going to be one-fourth times two to the first power. So it's going to be one-fourth times two. What is h of two going to be equal to? Well, it's going to be one-fourth times two squared, so it's going to be times two times two. Or, we could just view this as this is going to be two times h of one. And actually I should have done this when I wrote this one out, but this we can write as two times h of zero. So notice, if we were to take the ratio between h of two and h of one, it would be two. If we were to take the ratio between h of one and h of zero, it would be two. That is the common ratio between successive whole number inputs into our function. So, h of I could say h of n plus one over h of n is going to be equal to is going to be equal to actually I can work it out mathematically. One-fourth times two to the n plus one over one-fourth times two to the n. That cancels. Two to the n plus one, divided by two to the n is just going to be equal to two. That is your common ratio. So for the function h. 
For the function f, our common ratio is three. If we were to go the other way around, if someone said, hey, I have some function whose initial value, so let's say, I have some function, I'll do this in a new color, I have some function, g, and we know that its initial initial value is five. And someone were to say its common ratio its common ratio is six, what would this exponential function look like? And they're telling you this is an exponential function. Well, g of let's say x is the input, is going to be equal to our initial value, which is five. That's not a negative sign there, Our initial value is five. I'll write equals to make that clear. And then times our common ratio to the x power. So once again, initial value, right over there, that's the five. And then our common ratio is the six, right over there. So hopefully that gets you a little bit familiar with some of the parts of an exponential function, why they are called what they are called.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
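
For semantic search, the similarity scores are typically used to rank candidate contents against a topic. The sketch below shows that ranking step on hand-written unit vectors, so it runs without downloading the model. The helper `rank_by_similarity` and the toy vectors are assumptions for illustration; in practice the vectors would come from `model.encode` and the scores from `model.similarity`.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query_embedding, candidate_embeddings):
    """Return candidate indices ordered from most to least similar, plus the scores."""
    scores = [dot(query_embedding, c) for c in candidate_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical unit-length embeddings standing in for a topic and three contents.
query = [0.6, 0.8]
candidates = [[0.8, 0.6], [0.6, 0.8], [-0.6, -0.8]]
order, scores = rank_by_similarity(query, candidates)
print(order)  # the identical vector ranks first, the opposite vector last
```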

Evaluation

Metrics

Information Retrieval

Dataset: eval-ir
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.6326
cosine_accuracy@3	0.7914
cosine_accuracy@5	0.8481
cosine_accuracy@10	0.8968
cosine_precision@10	0.2383
cosine_precision@50	0.0709
cosine_precision@100	0.0392
cosine_recall@10	0.7041
cosine_recall@50	0.8725
cosine_recall@100	0.9170
cosine_ndcg@10	0.6529
cosine_mrr@10	0.7234
cosine_map@100	0.5971
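
As a rough guide to reading the table, the sketch below computes four of these metrics for a single query; the reported values would then be averages over all evaluation queries. This is a simplified illustration and not the evaluator's code, and the document IDs are hypothetical.

```python
def accuracy_at_k(ranked_ids, relevant_ids, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant."""
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / len(relevant_ids)

def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked_ids[:k], start=1):
        if doc in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking for one query; three documents are relevant in total.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
print(accuracy_at_k(ranked, relevant, 3))   # one relevant hit in the top 3
print(precision_at_k(ranked, relevant, 3))  # 1 hit out of 3 retrieved
print(recall_at_k(ranked, relevant, 3))     # 1 hit out of 3 relevant
print(mrr_at_k(ranked, relevant, 3))        # first hit at rank 2
```

Under these definitions, precision@k is bounded by the number of relevant documents divided by k, which would explain why precision falls as k grows while recall rises.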

Training Details

Training Dataset

Unnamed Dataset

Size: 190,175 training samples
Columns: topic and content
Approximate statistics based on the first 1000 samples:
	topic	content
type	string	string
details	min: 15 tokens mean: 41.93 tokens max: 128 tokens	min: 5 tokens mean: 62.57 tokens max: 128 tokens

Samples:

topic	content
`Triangles and polygons. Space, shape and measurement. Form 1. Malawi Mathematics Syllabus. Learning outcomes: students must be able to solve problems involving angles, triangles and polygons including: types of triangles, calculate the interior and exterior angles of a triangle, different types of polygons, interior angles and sides of a convex polygon, the size and exterior angle of any convex polygon.`	`Regular and Irregular Polygons. .`
`Triangles and polygons. Space, shape and measurement. Form 1. Malawi Mathematics Syllabus. Learning outcomes: students must be able to solve problems involving angles, triangles and polygons including: types of triangles, calculate the interior and exterior angles of a triangle, different types of polygons, interior angles and sides of a convex polygon, the size and exterior angle of any convex polygon.`	Classifying triangles based on its angles. A triangle is a closed figure consisting of three-line segments which are joined end to end. The joined line segments of a triangle form three angles. You can classify triangles according to sides and angles.. Classifying triangles based on its angles Albert Mhango, Mzimba Introduction: A triangle is a closed figure consisting of three-line segments which are joined end to end. The joined line segments of a triangle form three angles. You can classify triangles according to sides and angles. What is an interior angle? An interior angle is an inside of a shape. Explanation: When classifying triangles according to its angles, you look at the sizes of their interior angles. Under this classification, you have the following types of triangles: 1. Acute angled triangle: A triangle in which all interior angles are acute angles. Do you remember the meaning of acute angle? It is an angle which is less than 90°. Figure shows an example of an acute an...
`Triangles and polygons. Space, shape and measurement. Form 1. Malawi Mathematics Syllabus. Learning outcomes: students must be able to solve problems involving angles, triangles and polygons including: types of triangles, calculate the interior and exterior angles of a triangle, different types of polygons, interior angles and sides of a convex polygon, the size and exterior angle of any convex polygon.`	Classifying triangles. Learn to categorize triangles as scalene, isosceles, equilateral, acute, right, or obtuse. . What I want to do in this video is talk about the two main ways that triangles are categorized. The first way is based on whether or not the triangle has equal sides, or at least a few equal sides. Then the other way is based on the measure of the angles of the triangle. So the first categorization right here, and all of these are based on whether or not the triangle has equal sides, is scalene. And a scalene triangle is a triangle where none of the sides are equal. So for example, if I have a triangle like this, where this side has length 3, this side has length 4, and this side has length 5, then this is going to be a scalene triangle. None of the sides have an equal length. Now an isosceles triangle is a triangle where at least two of the sides have equal lengths. So for example, this would be an isosceles triangle. Maybe this has length 3, this has length 3, and this...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
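
To make the two parameters concrete, here is a minimal pure-Python sketch of the idea behind this loss: each topic is scored against every content in the batch with cosine similarity multiplied by `scale`, and cross-entropy pushes the matching pair above the in-batch negatives. It is a simplified illustration, not the library's implementation, and the function names and toy vectors are assumptions.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Hypothetical batch of two pairs: well aligned, then deliberately swapped.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss, swapped_loss)  # near zero when aligned, large when swapped
```

A larger batch supplies more in-batch negatives per anchor, which is one reason this loss is usually paired with a large batch size and a sampler that avoids duplicate texts within a batch.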

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
learning_rate: 2e-05
num_train_epochs: 1
warmup_ratio: 0.05
fp16: True
load_best_model_at_end: True
batch_sampler: no_duplicates
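
The learning rate and warmup ratio above interact with the linear scheduler listed in the full hyperparameters. The sketch below shows the usual shape of such a schedule: a linear ramp from zero to the base rate during warmup, then a linear decay back to zero. It is an illustration under assumptions: the function name is hypothetical, the total step count of 1000 is an arbitrary example, and rounding the warmup length up is assumed rather than confirmed by this card.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.05):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding rule
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Hypothetical run of 1000 optimizer steps.
for step in (0, 25, 50, 500, 1000):
    print(step, linear_schedule_lr(step, total_steps=1000))
```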

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.05
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	eval-ir_cosine_ndcg@10
0.0007	1	0.1782	-
0.1999	297	0.1245	0.6279
0.3997	594	0.1224	0.6423
0.5996	891	0.1168	0.6493
0.7995	1188	0.1179	0.6541
0.9993	1485	0.1227	0.6529
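
The epoch fractions in the log can be cross-checked against the dataset size and batch size reported above. The sketch assumes a single device and that the no_duplicates sampler yields the same number of batches as a plain split of the data; neither is stated in this card. The gradient accumulation setting of 1 is taken from the hyperparameter list.

```python
import math

train_samples = 190_175  # training set size reported above
batch_size = 128         # per_device_train_batch_size, assuming one device

# With gradient_accumulation_steps of 1, one optimizer step consumes one batch.
steps_per_epoch = math.ceil(train_samples / batch_size)

def epoch_fraction(step):
    """Fraction of the epoch completed after the given step, rounded as in the log."""
    return round(step / steps_per_epoch, 4)

logged_steps = (1, 297, 594, 891, 1188, 1485)
print(steps_per_epoch)
print([epoch_fraction(s) for s in logged_steps])
```

Under these assumptions the computed fractions match the Epoch column, and the evaluation interval of 297 steps corresponds to roughly one fifth of the epoch.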

Framework Versions

Python: 3.11.13
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.6.0+cu124
Accelerate: 1.7.0
Datasets: 2.14.4
Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}