FABLES: Evaluating faithfulness and content selection in book-length summarization Paper • 2404.01261 • Published Apr 1, 2024
PostMark: A Robust Blackbox Watermark for Large Language Models Paper • 2406.14517 • Published Jun 20, 2024
BLEUBERI: BLEU is a surprisingly effective reward for instruction following Paper • 2505.11080 • Published May 16, 2025
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published Feb 20, 2025
RankGen: Improving Text Generation with Large Ranking Models Paper • 2205.09726 • Published May 19, 2022
BooookScore: A systematic exploration of book-length summarization in the era of LLMs Paper • 2310.00785 • Published Oct 1, 2023