twinkle-ai's Collections

📋 Eval Logs

updated 16 days ago

Benchmark logs generated with Twinkle Eval, recording each model's output for every prompt.

  • twinkle-ai/gpt-oss-eval-logs-and-scores

    Updated 16 days ago • 2.63k rows • 146 downloads • 1 like

  • twinkle-ai/llama-4-eval-logs-and-scores

    Updated Apr 9 • 750 rows • 14 downloads • 2 likes