---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# xsum_55555_3000_1500_train

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic

# Load the fitted model from the Hugging Face Hub.
topic_model = BERTopic.load("KingKazma/xsum_55555_3000_1500_train")

# Returns a table with one row per topic.
topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 54
* Number of training documents: 3000

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - people - would - mr - year | 6 | -1_said_people_would_mr |
| 0 | party - eu - labour - vote - brexit | 1465 | 0_party_eu_labour_vote |
| 1 | trump - mr - president - republican - russia | 129 | 1_trump_mr_president_republican |
| 2 | care - health - nhs - patient - hospital | 76 | 2_care_health_nhs_patient |
| 3 | syria - syrian - attack - killed - force | 75 | 3_syria_syrian_attack_killed |
| 4 | cricket - wicket - england - test - ball | 64 | 4_cricket_wicket_england_test |
| 5 | club - league - season - appearance - loan | 59 | 5_club_league_season_appearance |
| 6 | wales - rugby - england - game - player | 58 | 6_wales_rugby_england_game |
| 7 | film - show - actor - actress - star | 55 | 7_film_show_actor_actress |
| 8 | medal - sport - olympic - gold - world | 54 | 8_medal_sport_olympic_gold |
| 9 | driving - driver - crash - car - road | 48 | 9_driving_driver_crash_car |
| 10 | chelsea - arsenal - city - goal - tottenham | 44 | 10_chelsea_arsenal_city_goal |
| 11 | president - mr - petrobras - odebrecht - government | 43 | 11_president_mr_petrobras_odebrecht |
| 12 | lifeboat - sea - rnli - ship - boat | 41 | 12_lifeboat_sea_rnli_ship |
| 13 | crime - police - child - force - abuse | 37 | 13_crime_police_child_force |
| 14 | man - police - men - wearing - arrested | 35 | 14_man_police_men_wearing |
| 15 | murray - seed - match - slam - set | 34 | 15_murray_seed_match_slam |
| 16 | dog - mountain - animal - avalanche - said | 34 | 16_dog_mountain_animal_avalanche |
| 17 | court - sexual - assault - trial - woman | 31 | 17_court_sexual_assault_trial |
| 18 | school - education - teacher - academy - pupil | 30 | 18_school_education_teacher_academy |
| 19 | fifa - ghana - burkina - african - cup | 29 | 19_fifa_ghana_burkina_african |
| 20 | music - album - song - like - im | 28 | 20_music_album_song_like |
| 21 | fire - blaze - rescue - said - building | 28 | 21_fire_blaze_rescue_said |
| 22 | energy - gas - shale - project - power | 27 | 22_energy_gas_shale_project |
| 23 | train - rail - bridge - scotrail - strike | 27 | 23_train_rail_bridge_scotrail |
| 24 | growth - rate - oil - market - us | 26 | 24_growth_rate_oil_market |
| 25 | town - foul - box - footed - half | 26 | 25_town_foul_box_footed |
| 26 | open - round - golf - par - birdie | 26 | 26_open_round_golf_par |
| 27 | china - north - chinese - xi - taiwan | 22 | 27_china_north_chinese_xi |
| 28 | bond - bank - greek - greece - eurozone | 22 | 28_bond_bank_greek_greece |
| 29 | race - lap - second - honda - driver | 21 | 29_race_lap_second_honda |
| 30 | president - mr - congolese - africa - african | 21 | 30_president_mr_congolese_africa |
| 31 | barcelona - fc - madrid - de - bayern | 19 | 31_barcelona_fc_madrid_de |
| 32 | murder - man - postmortem - court - found | 18 | 32_murder_man_postmortem_court |
| 33 | welsh - wales - government - assembly - labour | 17 | 33_welsh_wales_government_assembly |
| 34 | celtic - game - season - rangers - team | 17 | 34_celtic_game_season_rangers |
| 35 | heritage - castle - house - orkney - building | 17 | 35_heritage_castle_house_orkney |
| 36 | tax - deficit - debt - economy - financial | 16 | 36_tax_deficit_debt_economy |
| 37 | stream - jet - weather - wind - flood | 15 | 37_stream_jet_weather_wind |
| 38 | software - security - data - hacker - router | 15 | 38_software_security_data_hacker |
| 39 | painting - portrait - art - collection - artist | 14 | 39_painting_portrait_art_collection |
| 40 | apple - tablet - hp - firm - android | 14 | 40_apple_tablet_hp_firm |
| 41 | robertson - mr - court - knife - murder | 12 | 41_robertson_mr_court_knife |
| 42 | unsupported - device - updated - playback - media | 12 | 42_unsupported_device_updated_playback |
| 43 | iaaf - doping - athlete - athletics - antidoping | 11 | 43_iaaf_doping_athlete_athletics |
| 44 | stolen - theft - burglary - thief - store | 11 | 44_stolen_theft_burglary_thief |
| 45 | yn - ar - mae - bod - ei | 11 | 45_yn_ar_mae_bod |
| 46 | flight - plane - airport - aircraft - passenger | 11 | 46_flight_plane_airport_aircraft |
| 47 | baby - child - infant - mcelhinney - church | 10 | 47_baby_child_infant_mcelhinney |
| 48 | party - fillon - mr - socialist - macron | 10 | 48_party_fillon_mr_socialist |
| 49 | serbia - scotland - celtic - throwin - kick | 9 | 49_serbia_scotland_celtic_throwin |
| 50 | child - childcare - families - mental - nurse | 8 | 50_child_childcare_families_mental |
| 51 | turkey - migrant - eu - visa - greece | 6 | 51_turkey_migrant_eu_visa |
| 52 | supermarket - store - price - sale - tyrrells | 6 | 52_supermarket_store_price_sale |

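
Each value in the Label column appears to be the topic ID followed by the first four keywords, joined with underscores, and in BERTopic the ID -1 is the outlier bucket rather than a real topic. The sketch below splits a label into its parts and expresses a frequency as a share of the 3000 training documents. It uses only a few rows copied from the table, and the helper names `parse_label` and `share_of_corpus` are hypothetical, not part of BERTopic.

```python
# A few (label, frequency) rows copied from the topic table.
ROWS = [
    ("-1_said_people_would_mr", 6),
    ("0_party_eu_labour_vote", 1465),
    ("1_trump_mr_president_republican", 129),
    ("2_care_health_nhs_patient", 76),
]
N_TRAINING_DOCS = 3000  # from the topic overview


def parse_label(label: str) -> tuple[int, list[str]]:
    """Split a label such as '0_party_eu_labour_vote' into (0, ['party', ...])."""
    # Assumption: everything before the first underscore is the topic ID.
    topic_id, _, keywords = label.partition("_")
    return int(topic_id), keywords.split("_")


def share_of_corpus(frequency: int, total: int = N_TRAINING_DOCS) -> float:
    """Fraction of the training documents assigned to a topic."""
    return frequency / total


for label, frequency in ROWS:
    topic_id, keywords = parse_label(label)
    kind = "outliers" if topic_id == -1 else "topic"
    share = share_of_corpus(frequency)
    print(f"{topic_id:>3} ({kind}): {', '.join(keywords)} -> {share:.1%}")
```

Read this way, topic 0 alone covers 1465 of the 3000 documents (about 48.8%), while only 6 documents were left as outliers.
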
## Training hyperparameters

* calculate_probabilities: True
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False

## Framework versions

* Numpy: 1.22.4
* HDBSCAN: 0.8.33
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.31.0
* Numba: 0.57.1
* Plotly: 5.13.1
* Python: 3.10.12
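
The hyperparameters and versions listed above can be turned into a rough reproduction check. The following is a minimal sketch, not the author's training script: it compares the installed packages against the listed versions and collects the hyperparameters as keyword arguments for the `BERTopic` constructor. The mapping from the names in the list to PyPI distribution names (for example `UMAP` to `umap-learn`) is an assumption, and the helpers `installed_version`, `find_mismatches`, and `build_topic_model` are hypothetical names introduced here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions from "Framework versions", keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}

# Values from "Training hyperparameters"; all are BERTopic constructor arguments.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(expected: dict) -> dict:
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


def build_topic_model():
    """Create an unfitted BERTopic model with the listed hyperparameters."""
    # Imported lazily so the checks above run even without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)


for package, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
    print(f"{package}: expected {wanted}, installed {found or 'not installed'}")
```

The card does not record the embedding model or the UMAP and HDBSCAN settings, so a model built this way and refit on the same data is not guaranteed to reproduce the 54 topics shown above.
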