HenriLD's Collections

Dataset Mix for Pre-Training SLMs

updated Mar 25

  • open-thoughts/OpenThoughts-114k

    Viewer • Updated Apr 6 • 228k • 16.7k • 705

  • open-r1/OpenThoughts-114k-math

    Viewer • Updated Jan 30 • 89.1k • 737 • 81

  • HuggingFaceFW/fineweb

    Viewer • Updated Jan 31 • 25B • 393k • 2.17k

  • FreedomIntelligence/medical-o1-reasoning-SFT

    Viewer • Updated Apr 22 • 90.1k • 11.8k • 726

  • AI-MO/NuminaMath-CoT

    Viewer • Updated Nov 25, 2024 • 860k • 2.93k • 450

  • dmariko/init_data

    Viewer • Updated Jul 10, 2024 • 188k • 20

  • HenriLD/FDA_Docs

    Viewer • Updated Feb 12 • 30.4k • 10

  • ChayanM/MIMIC-Impression-Dataset

    Viewer • Updated Apr 28, 2024 • 292k • 33 • 2

  • allenai/cord19

    Updated Nov 3, 2022 • 382 • 5

  • MedRAG/pubmed

    Viewer • Updated Feb 27, 2024 • 2.21M • 2.91k • 74

  • EleutherAI/SmolLM2-135M-10B

    Viewer • Updated Apr 15 • 10.1M • 1.41k