Post
🔥 Community and Data Quality Are More For Alignment
A recipe to replicate SPIN (Self-Play Fine Tuning) with 30x less data:
🗣️ 50K samples vs 1.8K prompts curated by the 350+ amazing DIBT contributors.
⚗️ Distillation of Mistral Large instead of OpenAI
🙌 Open data & code with ⚗️distilabel
SPIN Paper:
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (2401.01335)
SPIN DIBT Collection with datasets and models:
argilla/dibt-prompt-collective-spin-65ef59062518776024395fc3
Repo:
https://github.com/argilla-io/distilabel-spin-dibt
Joint work with the amazing DIBT community 👇
@aashish1904 , @flozi00 , @sayhan , @munish0838 , @0-hero , @dvilasuero , @eren23 , @davanstrien , @ahnz , @BlackKakapo , @kitano-o , @mmhamdy , @sdiazlor , @Stopwolf , @gabrielmbmb , @tculler91 , @plaguss , @ignacioct , @Hugi-R , @davidberenstein1957 , @Korla , @alvarobartt , @Hugs4Llamas , @Sumandora , @nataliaElv , @jfcalvo , @Averill , @steventrouble , @vasilis , @aeros93 , @kayyshf , @thomasgauthier , @jeromebas , @Ameeeee , @ayoubelmhamdi , @TuringsSolutions , @efels , @Haleyok , @abrazador , @emessy , @Nindaleth , @burtenshaw , @vicgalle , @CortexPE , @casey-martin , @Leire-aguirre-eguiluz , @mrfakename , @Portias600kNeurons , @nathaliepett , @Filippo
A recipe to replicate SPIN (Self-Play Fine Tuning) with 30x less data:
🗣️ 50K samples vs 1.8K prompts curated by the 350+ amazing DIBT contributors.
⚗️ Distillation of Mistral Large instead of OpenAI
🙌 Open data & code with ⚗️distilabel
SPIN Paper:
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (2401.01335)
SPIN DIBT Collection with datasets and models:
argilla/dibt-prompt-collective-spin-65ef59062518776024395fc3
Repo:
https://github.com/argilla-io/distilabel-spin-dibt
Joint work with the amazing DIBT community 👇
@aashish1904 , @flozi00 , @sayhan , @munish0838 , @0-hero , @dvilasuero , @eren23 , @davanstrien , @ahnz , @BlackKakapo , @kitano-o , @mmhamdy , @sdiazlor , @Stopwolf , @gabrielmbmb , @tculler91 , @plaguss , @ignacioct , @Hugi-R , @davidberenstein1957 , @Korla , @alvarobartt , @Hugs4Llamas , @Sumandora , @nataliaElv , @jfcalvo , @Averill , @steventrouble , @vasilis , @aeros93 , @kayyshf , @thomasgauthier , @jeromebas , @Ameeeee , @ayoubelmhamdi , @TuringsSolutions , @efels , @Haleyok , @abrazador , @emessy , @Nindaleth , @burtenshaw , @vicgalle , @CortexPE , @casey-martin , @Leire-aguirre-eguiluz , @mrfakename , @Portias600kNeurons , @nathaliepett , @Filippo