mdouglas's Collections

Papers: LLM as a Judge

updated Apr 10, 2024

  • JudgeLM: Fine-tuned Large Language Models are Scalable Judges

    Paper • 2310.17631 • Published Oct 26, 2023 • 35

  • Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

    Paper • 2306.05685 • Published Jun 9, 2023 • 34

  • G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

    Paper • 2303.16634 • Published Mar 29, 2023 • 3

  • Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

    Paper • 2310.08491 • Published Oct 12, 2023 • 55

  • LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models

    Paper • 2305.13711 • Published May 23, 2023 • 2

  • Leveraging Large Language Models for NLG Evaluation: A Survey

    Paper • 2401.07103 • Published Jan 13, 2024 • 4