tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:44978
  - loss:ReasoningGuidedRankingLoss
base_model: google-bert/bert-base-uncased
widget:
  - source_sentence: Severe weather rips through Alabama university, takes aim at Southeast
    sentences:
      - >-
        The second text provides a detailed elaboration of the first text. It
        expands on the initial statement about severe weather in Alabama,
        providing specific details about the damage at Jacksonville State
        University, the impact on the surrounding areas, and the broader effects
        of the storm.
      - >-
        The labor movement has been living in the shadow of a national assault
        on public-sector collective bargaining for a while now. We’ve talked a
        lot about Harris v. Quinn, how labor dodged a bullet with that case, and
        dodged another with the death of Scalia before the Friedrichs case could
        be decided. But Janus v. American Federation of State, County, and
        Municipal Employees, Council 31 is likely to be the case labor has been
        dreading, and we break it down for you today with Andy Stettner of the
        Century Foundation.

        We also look at Uber’s failures in London and neoliberalism’s failures
        in France, a union drive at the Los Angeles Times and a labor solidarity
        mission to Puerto Rico post-hurricanes. For Argh, we consider forced
        labor “rehab” facilities, and how moving left is the solution to the
        rise of the populist right.

        If you think our work is worth supporting as we soldier on through
        Trumplandia, please consider becoming a sustaining member of Belabored
        or donating or subscribing to Dissent. Help keep us going for the next
        136 episodes!
      - >-
        Severe weather that spawned at least one tornado slammed Alabama’s
        Jacksonville State University on Monday night and took aim at the rest
        of the southeast.

        Alabama state troopers said the damage in Jacksonville, Ala. left the
        city looking like a “war zone.” Strong winds downed trees and damaged
        buildings as the National Weather Service confirmed a “damaging and
        possibly large tornado near Jacksonville and Calhoun counties and was
        moving east.

        Jacksonville State University Athletic Director Greg Seitz wrote in a
        tweet that there was significant damage to campus, including to the
        newly renovated Pete Mathews Coliseum.

        "I can confirm we have major roof damage at Pete Mathews Coliseum, but
        The Pete is not completely destroyed," Seitz said in a tweet.

        Tuscaloosa County Sheriff’s Office Lt. Andy Norris said in a tweet that
        troopers called Jacksonville a “war zone.” He said the arena’s roof
        “took major damage.”

        Photos seen on social media showed the extent of the damage Jacksonville
        took.

        Alabama Gov. Kay Ivey confirmed in a statement late Monday there was
        “significant damage” throughout the state, according to WBRC-TV.

        Cities in northern Alabama reported power outages and the NWS in
        Huntsville reported at least three tornadoes in the area.

        The severe weather moved into Georgia late Monday night.

        Flights at Hartsfield Airport in Atlanta were not officially grounded as
        the damaging winds moved into the area. However, the airport warned on
        Twitter that delays were likely.

        Meanwhile, more than 150 people reportedly took cover into a historic
        cave in Cave Springs, Ga.

        The storms knocked out power to at least 15,000 homes and businesses in
        Alabama. Georgia Power was rpeorting more than 26,000 of their customers
        were without power, according to Cobb County News.

        The Associated Press contributed to this report.
  - source_sentence: NCAA Sexual Violence Policy Criticized as Weak
    sentences:
      - >-
        The second text provides details that elaborate on the criticism
        mentioned in the first text. It describes the NCAA's new rules and then
        presents a specific critique, highlighting the perceived weaknesses in
        the policy, such as the lack of strong enforcement and accountability,
        thus supporting the initial claim of weakness.
      - >-
        CHAMPAIGN -- Illinois had one final chance to finish this week on a
        recruiting strong note. After missing out on three Class of 2018
        forwards early in the week, the Illini were still in the running for
        four-star Georgia prospect Landers Nolley.

        Until Friday morning. Nolley, a 6-foot-7 wing who played his sophomore
        season at Curie in Chicago before moving to Georgia, narrowed his
        choices to Georgia and Virginia Tech.

        Nolley's almost final decision left Illinois 0 for 4 on 2018 targets
        this week after Lukas Kisunas (UConn), George Conditt (Iowa State) and
        Colin Castleton (Michigan) all committed elsewhere. That leaves the
        Illini in further pursuit of in-state targets like Morgan Park's Ayo
        Dosunmu, who will start an official visit at Illinois on Oct. 13, and
        Simeon's Talen Horton-Tucker.
      - >-
        The National Collegiate Athletic Association adopted rules last week
        that require key administrators to complete annual training on sexual
        violence prevention, and to certify annually that the institution's
        teams and programs are familiar with policies and processes to prevent
        sexual violence or to deal with incidents that take place. Further, the
        rules require institutions to provide information to athletes on
        institutional policies and procedures.

        A column in The Huffington Post noted that the NCAA rules are largely
        similar to what federal law requires of colleges, and that they don't
        address issues related to athletes found to have assaulted others. What
        the rules lack, the column said, "is enforcement or accountability that
        approaches penalties reaching the [same] level as the purchase of a
        hamburger for a student athlete."
  - source_sentence: >-
      William few Pkwy and Chamblin Rd new traffic signal - WFXG FOX 54 - News
      Now
    sentences:
      - >-
        The second text elaborates on the first by providing details about the
        traffic signal mentioned in the title. It specifies the location
        (William Few Parkway and Chamblin Road) and the schedule for the
        signal's operation, including the dates it will be in flashing and
        normal modes.
      - >-
        Columbia County wants to inform the driving public of a new traffic
        signal installation. It’s located at William Few Parkway and Chamblin
        Road.

        The light is scheduled to go into flashing mode on Friday October 6th,
        2017. The signal will remain in flashing mode for the remainder of this
        week, including the weekend. The signal is scheduled to be placed into
        normal stop and go operation on Tuesday, October 10, 2017.

        Copyright 2017 WFXG. All rights reserved.
      - >-
        NEWPORT BEACH, Calif. (AP) — The Latest on a fatal helicopter crash in
        Southern California (all times local):

        10:07 a.m.

        California authorities have released the name of all three people killed
        when a small helicopter crashed in a Newport Beach neighborhood.

        The Orange County Sheriff's Department says the dead are 60-year-old
        Joseph Anthony Tena of Newport Beach, 45-year-old Kimberly Lynne Watzman
        of Santa Monica and 56-year-old Brian R. Reichelt of Hollywood.

        The crash Wednesday in a neighborhood involved four people in the
        helicopter and a bystander. Newport Beach police spokeswoman Jennifer
        Manzella says all three people killed were in the helicopter.

        There's no information about two people who were injured.

        ___

        11:03 p.m.

        Officials say three people were killed and two more injured when a
        helicopter crashed into a home in a suburban Southern California
        neighborhood.

        Authorities say four people were aboard the Robinson R44 helicopter when
        it went down in Newport Beach on Tuesday afternoon just a few minutes
        after taking off from John Wayne Airport.

        One person who was outside on the ground was involved in the crash,
        though officials did not specify who died and who was injured.

        Neighbor Marian Michaels says she thought it was an earthquake when the
        helicopter slammed into the house.

        Another neighbor, Roger Johnson, says he heard a scream that sounded
        like it was from a horror movie before rushing to the scene to try to
        help.
  - source_sentence: >-
      Former AG, ex-Jordanian PM top contenders for Pak's ICJ ad-hoc judge
      choice: report
    sentences:
      - >-
        The second text elaborates on the first by providing details about the
        contenders for the ad-hoc judge position. It names specific individuals
        (ex-AG and former Jordanian PM) and provides context about the case at
        the ICJ, the nomination process, and the sources of the information. The
        report confirms the information presented in the title.
      - >-
        Image caption The last confirmed sighting of Brian McGowan was in Plean
        on 21 September

        Police searching for a man who has not been seen for more than two weeks
        are asking the public to check outbuildings and gardens for any trace of
        him.

        Brian McGowan, 42, was last seen in the Gillespie Terrace area of Plean,
        near Stirling, at 16:00 on 21 September.

        Investigations have uncovered a "probable" sighting of him in the
        Gallamuir Drive area at 01:30 the following day.

        Police said that since then he has not returned home or contacted
        anyone.

        Insp Donna Bryans said: "Brian has now been missing for two weeks and it
        is vital that we find him.

        "I would like to thank the local community who have come out to search
        for Brian and helped with our investigations so far.

        "I would ask residents and visitors to Plean, as well as visitors to
        Plean Country Park, to be vigilant and report any sighting of anyone
        seen matching Brian's description."

        Insp Bryans said a search of gardens and outbuildings in the area could
        help officers discover Mr McGowan's whereabouts.

        He is described as 5ft 10 tall, of slim build with short dark hair. He
        had blue eyes and tattoos on his fingers and speaks with a local accent.

        When last seen he was wearing a black baseball cap, a black G-Star
        jacket, grey Armani jumper, grey Adidas tracksuit bottoms with black
        stripes on the sides and black and grey Adidas Y3 trainers.
      - >-
        ISLAMABAD: The Pakistan government has begun consultations over the
        nomination of an ad-hoc judge for the Kulbhushan Jadhav case being heard
        at the International Court of Justice with an ex-attorney general and a
        former Jordanian premier emerging as the top contenders, a media report
        said today. India had moved the Hague-based International Court of
        Justice (ICJ) against Jadhav's death penalty handed down by a Pakistani
        military court. The ICJ had on May 18 restrained Pakistan from executing
        the death sentence.Pakistan government's functionaries have started
        consultations for the nomination of an ad-hoc judge, Express Tribune
        reported, citing sources.During the tenure of ousted prime minister
        Nawaz Sharif , former Supreme Court judge Khalilur Rehman Ramday was
        approached, but he declined the nomination, the report said.Sources were
        quoted by the daily as saying that the Attorney General for Pakistan's
        (AGP) office has recommended the names of senior lawyer Makhdoom Ali
        Khan and former Jordanian prime minister Awn Shawkat Al-Khasawneh to the
        Prime Minister's Office for the nomination of one name as an ad-hoc
        judge.Khasawneh served as an ICJ judge for over a decade, while Khan, a
        former Attorney General who is seen as the favourite for the job, also
        has experience in international arbitration cases, having represented
        eight different countries in international courts.The nomination of the
        ad-hoc judge will be finalised after getting inputs from the Foreign
        Office and the military establishment, the sources said, adding that
        earlier, government functionaries had also considered the name of former
        chief justice of Pakistan Tassaduq Hussain Jillani.An official was
        quoted as saying that the name of the ad- hoc judge will be finalised
        next month, soon after the Indian side files its documents.Meanwhile,
        Pakistan Bar Council (PBC) representative Raheel Kamran Sheikh has
        called upon the government to seek Parliament's approval on the
        appointment of the ad-hoc judge.Only one person has previously been
        appointed as ICJ judge in Pakistan's history -- former foreign minister
        Zafarullah Khan, who was appointed in 1954 and later became the
        president of the court.Yaqub Ali Khan and Sharifuddin Pirzada both
        served as ad-hoc judges, as did Zafarullah.
  - source_sentence: Energy advocates call for new commitment to renewable growth
    sentences:
      - >-
        The second text elaborates on the first by providing details about the
        specific context of the energy advocates' call for renewable growth. It
        identifies the advocates (CFE, VoteSolar, Environment Connecticut), the
        specific renewable energy program (community solar), and the reasons for
        their call, including program delays and design flaws.
      - >-
        The piece below was submitted by CFE, VoteSolar, and Environment
        Connecticut in response to the latest delay in the shared solar pilot
        program.

        Solar and environmental advocates are calling for a new community solar
        program in Connecticut that will expand solar access, energy choices and
        consumer savings for families, municipalities, and businesses statewide.
        The demand follows today’s Department of Energy and Environmental
        Protection (DEEP) technical hearing where attendees reviewed the state’s
        current Shared Clean Energy Facilities pilot program. The pilot has
        stalled several times over the last two years, most recently following
        DEEP’s decision to scrap all the proposals they have received and issue
        a new request for projects. DEEP heard from many advocates and
        developers at the hearing who are frustrated with this latest delay and
        skeptical about the long term success of the pilot.

        The current pilot program was meant to expand solar access to
        Connecticut energy customers who can’t put solar on their own roof, but
        it contained flaws that have prevented any development to date. As set
        out in the legislation, the program has several poor design elements and
        a goal too small to draw significant private sector interest. Below are
        statements from stakeholders in Connecticut’s clean energy economy:

        “For years, Connecticut has missed out on the opportunity to bring solar
        energy choices to all consumers and more clean energy jobs to the
        state,” said Sean Garren, Northeast Regional director for Vote Solar.
        “Connecticut’s lackluster community solar program hasn’t unlocked the
        benefits of solar access for a single resident to date due to poor
        design and a lack of ambition at the scale needed, brought about by the
        electric utilities’ intervention. We’re calling on the legislature to
        catch up to the rest of New England  and the nation  with a smart,
        well-structured community solar program designed to serve consumers
        statewide.”

        “Two years of foot dragging and refusal by the Department of Energy and
        Environmental Protection to follow the law and implement a community
        solar program is preventing tens of thousands of Connecticut families
        from gaining access to clean, affordable, secure solar power,” said
        Chris Phelps, State Director for Environment Connecticut. “Community
        solar is helping other states accelerate solar growth, create jobs, and
        cut pollution. Connecticut policy makers should take action now to
        create a bold community solar program.”

        “Shared solar programs have been sweeping the nation for the last
        decade, but Connecticut has been left in the shade  losing out on
        healthier air, investment dollars, and green jobs that would accompany a
        full-scale, statewide shared solar program,” said Claire Coleman,
        Climate and Energy Attorney for Connecticut Fund for the Environment.
        “DEEP’s decision to start over with the already overly-restrictive
        shared solar pilot puts Connecticut further in the dark. Our climate and
        economy cannot wait any longer. Connecticut’s leaders must move quickly
        to ramp up in-state renewables through a full-scale shared solar program
        if Connecticut is going to have any chance of meeting its obligations
        under the Global Warming Solutions Act to reduce greenhouse gas
        emissions.”

        Vote Solar is a nonprofit organization working to foster economic
        development and energy independence by bringing solar energy to the
        mainstream nationwide. Learn more at votesolar.org.
      - >-
        BEIJING: China will waive income tax for three years for foreign
        investors trading the country’s new crude futures contract, the Ministry
        of Finance said on Tuesday, in a bid to attract overseas capital for the
        much anticipated launch.

        The start of trading on Monday will mark the culmination of a years-long
        push by China to create Asia’s first oil futures benchmark, and is aimed
        at giving the world’s biggest oil importer more clout in pricing crude
        sold to Asia.

        It will potentially give the Shanghai International Energy Exchange,
        which will operate the new contract, a share of the trillions of dollars
        each year in oil futures trading.

        The finance ministry said foreign brokers will be exempted from paying
        income tax on commissions they earn from dealing in the new Shanghai
        crude futures.

        The tax exemption could help encourage foreign players to engage with
        the new contract, despite concerns about issues such as foreign exchange
        conversion and potential capital curbs.

        The number of foreign investors seeking to open non-resident accounts to
        allow trading has so far been below expectations, a source at CITIC, one
        of eight banks that is handling margin deposits for foreign investors,
        said. The source declined to be named as he is not authorized to talk
        with media.

        The oil market is closely watching the liquidity of the contract, as
        institutional investors and brokers expect trading volumes and open
        interest to be relatively small compared with China’s iron ore, copper
        and steel futures contracts.

        China in recent days has provided more details on the contract,
        including margins, trading limits and transaction fees, and has approved
        the use of six bonded storage warehouses.
datasets:
  - bwang0911/reasoning_pairs_filtered_w_reason_ccnews
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on google-bert/bert-base-uncased
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: mteb/nfcorpus
          type: mteb/nfcorpus
        metrics:
          - type: cosine_accuracy@1
            value: 0.3126934984520124
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.47678018575851394
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.5325077399380805
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5975232198142415
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3126934984520124
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2549019607843137
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20990712074303408
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.16563467492260062
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.03117827434222373
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.05624265377613812
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.06877168791903203
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.09700903168215257
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.21852791504742514
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.40163890117450485
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.08949558554054256
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: mteb/trec-covid
          type: mteb/trec-covid
        metrics:
          - type: cosine_accuracy@1
            value: 0.62
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.82
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.92
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.94
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.62
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.5599999999999999
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.5519999999999999
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.512
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.0005213598128605203
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.0014060584814840184
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.0023515414225962748
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.004357324560804962
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.5323227421340048
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7306666666666668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.22987991064708832
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: mteb/fiqa
          type: mteb/fiqa
        metrics:
          - type: cosine_accuracy@1
            value: 0.13734567901234568
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.22839506172839505
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.2700617283950617
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.345679012345679
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.13734567901234568
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.09310699588477366
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06944444444444445
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.04645061728395062
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.0697683960415442
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.12649965346724604
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.15659102129009536
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.19997600136489024
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.15747637847224993
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.19570105820105824
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.12811920879354669
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: mteb/quora
          type: mteb/quora
        metrics:
          - type: cosine_accuracy@1
            value: 0.7256
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8531
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.8898
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9263
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.7256
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33316666666666667
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.21984
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.12146000000000004
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6303186330948595
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.7900249099696033
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.838050682910748
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.887497633693034
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8013139502721578
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7959599603174561
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.764750227681921
            name: Cosine Map@100

SentenceTransformer based on google-bert/bert-base-uncased

This is a sentence-transformers model fine-tuned from google-bert/bert-base-uncased on the reason_unfiltered dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

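Model Description

  • Model Type: Sentence Transformer
  • Base model: google-bert/bert-base-uncased
  • Maximum Sequence Length: 196 tokens
  • Output Dimensionality: 768 dimensions
  • Training Dataset: reason_unfiltered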

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 196, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
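
The Pooling module above enables only pooling_mode_mean_tokens, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name mean_pool and the toy vectors are hypothetical, and the sketch assumes padding positions are excluded through the attention mask.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [total / count for total in summed]


# Toy example (hypothetical data): three token positions, the last is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 768 dimensions and at most 196 token positions are kept per input.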

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bwang0911/reasoning-bert-ccnews")
# Run inference
sentences = [
    'Energy advocates call for new commitment to renewable growth',
    'The piece below was submitted by CFE, VoteSolar, and Environment Connecticut in response to the latest delay in the shared solar pilot program.\nSolar and environmental advocates are calling for a new community solar program in Connecticut that will expand solar access, energy choices and consumer savings for families, municipalities, and businesses statewide. The demand follows today’s Department of Energy and Environmental Protection (DEEP) technical hearing where attendees reviewed the state’s current Shared Clean Energy Facilities pilot program. The pilot has stalled several times over the last two years, most recently following DEEP’s decision to scrap all the proposals they have received and issue a new request for projects. DEEP heard from many advocates and developers at the hearing who are frustrated with this latest delay and skeptical about the long term success of the pilot.\nThe current pilot program was meant to expand solar access to Connecticut energy customers who can’t put solar on their own roof, but it contained flaws that have prevented any development to date. As set out in the legislation, the program has several poor design elements and a goal too small to draw significant private sector interest. Below are statements from stakeholders in Connecticut’s clean energy economy:\n“For years, Connecticut has missed out on the opportunity to bring solar energy choices to all consumers and more clean energy jobs to the state,” said Sean Garren, Northeast Regional director for Vote Solar. “Connecticut’s lackluster community solar program hasn’t unlocked the benefits of solar access for a single resident to date due to poor design and a lack of ambition at the scale needed, brought about by the electric utilities’ intervention. '
    'We’re calling on the legislature to catch up to the rest of New England — and the nation — with a smart, well-structured community solar program designed to serve consumers statewide.”\n“Two years of foot dragging and refusal by the Department of Energy and Environmental Protection to follow the law and implement a community solar program is preventing tens of thousands of Connecticut families from gaining access to clean, affordable, secure solar power,” said Chris Phelps, State Director for Environment Connecticut. “Community solar is helping other states accelerate solar growth, create jobs, and cut pollution. Connecticut policy makers should take action now to create a bold community solar program.”\n“Shared solar programs have been sweeping the nation for the last decade, but Connecticut has been left in the shade — losing out on healthier air, investment dollars, and green jobs that would accompany a full-scale, statewide shared solar program,” said Claire Coleman, Climate and Energy Attorney for Connecticut Fund for the Environment. “DEEP’s decision to start over with the already overly-restrictive shared solar pilot puts Connecticut further in the dark. Our climate and economy cannot wait any longer. Connecticut’s leaders must move quickly to ramp up in-state renewables through a full-scale shared solar program if Connecticut is going to have any chance of meeting its obligations under the Global Warming Solutions Act to reduce greenhouse gas emissions.”\nVote Solar is a nonprofit organization working to foster economic development and energy independence by bringing solar energy to the mainstream nationwide. Learn more at votesolar.org.',
    "The second text elaborates on the first by providing details about the specific context of the energy advocates' call for renewable growth. It identifies the advocates (CFE, VoteSolar, Environment Connecticut), the specific renewable energy program (community solar), and the reasons for their call, including program delays and design flaws.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Information Retrieval

Metric mteb/nfcorpus mteb/trec-covid mteb/fiqa mteb/quora
cosine_accuracy@1 0.3127 0.62 0.1373 0.7256
cosine_accuracy@3 0.4768 0.82 0.2284 0.8531
cosine_accuracy@5 0.5325 0.92 0.2701 0.8898
cosine_accuracy@10 0.5975 0.94 0.3457 0.9263
cosine_precision@1 0.3127 0.62 0.1373 0.7256
cosine_precision@3 0.2549 0.56 0.0931 0.3332
cosine_precision@5 0.2099 0.552 0.0694 0.2198
cosine_precision@10 0.1656 0.512 0.0465 0.1215
cosine_recall@1 0.0312 0.0005 0.0698 0.6303
cosine_recall@3 0.0562 0.0014 0.1265 0.79
cosine_recall@5 0.0688 0.0024 0.1566 0.8381
cosine_recall@10 0.097 0.0044 0.2 0.8875
cosine_ndcg@10 0.2185 0.5323 0.1575 0.8013
cosine_mrr@10 0.4016 0.7307 0.1957 0.796
cosine_map@100 0.0895 0.2299 0.1281 0.7648
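
To make the @k rows above concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k and MRR@k can be computed for a single query from a ranked list of document IDs. The function names and the toy data are illustrative assumptions, not the evaluator's actual code; the reported scores are averages of such per-query values.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical ranking for one query (IDs are made up for illustration).
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d7", "d4", "d8"}

print(accuracy_at_k(ranked, relevant, 3))   # one relevant hit in the top 3
print(precision_at_k(ranked, relevant, 3))  # 1 of 3 results is relevant
print(recall_at_k(ranked, relevant, 5))     # 2 of 3 relevant documents found
print(mrr_at_k(ranked, relevant, 10))       # first relevant result is at rank 2
```

With k = 1 the accuracy and precision definitions coincide, which is why the cosine_accuracy@1 and cosine_precision@1 rows are identical. Recall divides by the number of relevant documents per query, so a dataset with many relevant documents per query can show high precision and very low recall at the same k.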

Training Details

Training Dataset

reason_unfiltered

  • Dataset: reason_unfiltered at 2e4fb05
  • Size: 44,978 training samples
  • Columns: title, body, and reason
  • Approximate statistics based on the first 1000 samples:
    column   type     min tokens   mean tokens   max tokens
    title    string   6            15.34         42
    body     string   21           178.04        196
    reason   string   28           59.19         88
  • Samples:
    Sample 1
    title: Fight Leaves Wayne Simmonds Shirtless
    body: Reed Saxon/AP Images
    Kevin Bieksa and Wayne Simmonds dropped the gloves just 95 seconds into last night’s 4-3 Ducks shootout win over the Flyers, and Bieksa immediately yanked his opponent’s jersey over his head, to the delight of the crowd and to grins from Simmonds and the officials.
    That’s not supposed to happen. NHL players wear something called a fight strap, which binds the back of the jersey to the pants, preventing the jersey from being pulled off. (Losing a jersey is an advantage in a fight, as it gives the shirtless player’s opponent nothing to grab on to. Sabres enforcer Rob Ray was notorious for losing his gear in a fight, occasionally taking it off himself before clinching.) Any player who engaged in a fight without wearing a fight strap is subject to an automatic game misconduct.
    Advertisement
    Simmonds wasn’t ejected, though; at the one-minute mark of the video above, you can see he did have his fight strap properly attached. It just broke, which happens on occasion.
    reason: The article describes a hockey fight involving Wayne Simmonds, confirming the title's claim. It details the fight, including Simmonds' jersey being pulled off, and explains the rules and context around the incident, directly elaborating on the event suggested by the title.

    Sample 2
    title: Merck CEO Kenneth Frazier ditches Trump over Charlottesville silence
    body: Merck CEO Kenneth C. Frazier resigned from the president’s council on manufacturing Monday in direct protest of President Donald Trump’s lack of condemnation of white nationalist actions in Charlottesville, Va. over the weekend.
    In a statement, Frazier, who is African-American, said he believes the country’s strength comes from the diversity of its citizens and that he feels personally compelled to stand up for that diversity and against intolerance.
    “America’s leaders must honor our fundamental values by clearly rejecting expressions of hatred, bigotry and group supremacy, which run counter to the American ideal that all people are created equal,” he wrote. “As CEO of Merck, and as a matter of personal conscience, I feel a responsibility to take a stand against intolerance and extremism.”
    RELATED: At least one death has been confirmed after a car plowed into a crowd of protesters in Charlottesville
    Trump immediately fired back at Frazier on Twitter, saying the Merck CEO now “will have...
    reason: The second text provides a detailed elaboration of the first. It explains the context of Kenneth Frazier's resignation, the reasons behind it (Trump's silence on Charlottesville), and includes Frazier's statement. It also provides additional background information about Frazier and the President's Manufacturing Council.

    Sample 3
    title: Lightning's Braydon Coburn: Joining road trip
    body: Coburn (lower body) will travel with the team on its upcoming four-game road trip and is hoping to play at some point in the second half of the trip, Bryan Burns of the Lightning's official site reports.
    The veteran blueliner is yet to play in the month of December, having already missed four games. However, the fact that Coburn is traveling with the team and has been given a chance to play at some point within the next week will be music to the ears of fantasy owners who benefited from Coburn's surprising production -- seven points in 25 games -- earlier in the season. Keep an eye out for updates as the trip progresses.
    reason: The second text elaborates on the first by providing details about Braydon Coburn's situation. It specifies that he will join the team on a road trip and offers context about his injury, recovery timeline, and potential for playing, directly expanding on the initial announcement.
  • Loss: ReasoningGuidedRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
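
This card does not describe how ReasoningGuidedRankingLoss works internally. As a rough illustration of what the two parameters usually mean in Sentence Transformers ranking losses, the sketch below follows the in-batch-negatives pattern used by MultipleNegativesRankingLoss: cosine similarities are multiplied by scale and passed to a cross-entropy in which each anchor's own positive is the target. Treating this loss the same way is an assumption, and the helper names and toy vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over scaled cosine scores; positives[i] is the target for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Every positive in the batch is scored; the others act as negatives.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional embeddings, made up for illustration only.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive points towards its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # positives paired with the wrong anchor

print(in_batch_ranking_loss(anchors, aligned))  # close to zero
print(in_batch_ranking_loss(anchors, swapped))  # much larger
```

A larger scale sharpens the softmax over the batch, so small differences in cosine similarity translate into larger differences in loss.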
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • learning_rate: 1e-05
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
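
As a quick sanity check on these settings, the sketch below derives the step counts they imply for the 44,978-sample training set over 3 epochs (num_train_epochs in the full list). It assumes a single device and that warmup steps are the ceiling of warmup_ratio times the total step count; the variable names are illustrative.

```python
import math

num_samples = 44_978   # training set size reported above
batch_size = 256       # per_device_train_batch_size
num_epochs = 3         # num_train_epochs
warmup_ratio = 0.1

# The final partial batch is kept (dataloader_drop_last: False), hence the ceiling.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
# Assumption: warmup length is rounded up to a whole number of steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)  # 176 528 53
```

These numbers agree with the training logs, where step 10 corresponds to epoch 0.0568 (10 / 176) and the last logged step, 520, to epoch 2.9545.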

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss mteb/nfcorpus_cosine_ndcg@10 mteb/trec-covid_cosine_ndcg@10 mteb/fiqa_cosine_ndcg@10 mteb/quora_cosine_ndcg@10
-1 -1 - 0.0583 0.2174 0.0237 0.6103
0.0568 10 3.443 - - - -
0.1136 20 2.9692 - - - -
0.1705 30 2.1061 - - - -
0.2273 40 1.3012 0.0901 0.3585 0.0642 0.7024
0.2841 50 0.9825 - - - -
0.3409 60 0.7112 - - - -
0.3977 70 0.5853 - - - -
0.4545 80 0.5555 0.1714 0.5160 0.1287 0.7800
0.5114 90 0.4633 - - - -
0.5682 100 0.4216 - - - -
0.625 110 0.3846 - - - -
0.6818 120 0.4017 0.1923 0.5446 0.1417 0.7890
0.7386 130 0.3606 - - - -
0.7955 140 0.3731 - - - -
0.8523 150 0.3451 - - - -
0.9091 160 0.3352 0.2017 0.5343 0.1472 0.7951
0.9659 170 0.3364 - - - -
1.0227 180 0.2606 - - - -
1.0795 190 0.2627 - - - -
1.1364 200 0.2641 0.2065 0.5449 0.1499 0.7963
1.1932 210 0.2448 - - - -
1.25 220 0.2394 - - - -
1.3068 230 0.2433 - - - -
1.3636 240 0.2236 0.2096 0.5432 0.1519 0.7975
1.4205 250 0.221 - - - -
1.4773 260 0.2215 - - - -
1.5341 270 0.2291 - - - -
1.5909 280 0.2433 0.2102 0.5322 0.1543 0.7994
1.6477 290 0.219 - - - -
1.7045 300 0.2207 - - - -
1.7614 310 0.2102 - - - -
1.8182 320 0.2138 0.2163 0.5289 0.1553 0.8006
1.875 330 0.2076 - - - -
1.9318 340 0.2076 - - - -
1.9886 350 0.2066 - - - -
2.0455 360 0.2046 0.2154 0.5339 0.1558 0.8006
2.1023 370 0.1844 - - - -
2.1591 380 0.17 - - - -
2.2159 390 0.1913 - - - -
2.2727 400 0.165 0.2165 0.5339 0.1547 0.8014
2.3295 410 0.1878 - - - -
2.3864 420 0.1841 - - - -
2.4432 430 0.1683 - - - -
2.5 440 0.1767 0.2178 0.5307 0.1565 0.8014
2.5568 450 0.1627 - - - -
2.6136 460 0.161 - - - -
2.6705 470 0.1717 - - - -
2.7273 480 0.1832 0.2169 0.5341 0.1570 0.8012
2.7841 490 0.1673 - - - -
2.8409 500 0.1517 - - - -
2.8977 510 0.1797 - - - -
2.9545 520 0.1862 0.2185 0.5323 0.1575 0.8013

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.5.0.dev0
  • Transformers: 4.50.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}