ankner's Collections
Critique-out-Loud Reward Models

updated Sep 5, 2024

Paper: https://arxiv.org/abs/2408.11791 | Code: https://github.com/zankner/CLoud

  • ankner/Llama3-8B-CLoud-RM

    8B • Updated Oct 16, 2024 • 171 • 1

  • ankner/Llama3-8B-Classic-RM

    8B • Updated Oct 17, 2024 • 3

  • ankner/Llama3-70B-CLoud-RM

    71B • Updated Oct 18, 2024 • 13 • 1

  • ankner/Llama3-70B-Classic-RM

    71B • Updated Oct 18, 2024 • 3

  • ankner/Llama3-8b-ultra-oracle

    Viewer • Updated Sep 5, 2024 • 124k • 24

  • ankner/Llama3-8b-ultra-self-gen-8b

    Viewer • Updated Sep 5, 2024 • 124k • 13

  • ankner/Llama3-8b-ultra-self-gen-70b

    Viewer • Updated Sep 5, 2024 • 124k • 33