arxiv:2406.16306

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Published on Jun 24, 2024

Authors:

Bolian Li ,

Abstract

Cascade Reward Sampling (CARDS) improves the efficiency and quality of decoding-time alignment in large language models by minimizing redundant computations through segment-level rejection sampling and uncertainty-based segmentation.

AI-generated summary

Aligning large language models (LLMs) with human preferences is essential for their applications. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that avoids fine-tuning model parameters. This approach retains the general utility of pretrained LLMs but often suffers from significant inefficiencies during decoding, primarily due to wasted token generation and excessive reward evaluations. To address these challenges, we introduce Cascade Reward Sampling (CARDS) to resolve both efficiency bottlenecks in decoding-time alignment. Specifically, we develop a segment-level rejection sampling algorithm that minimizes redundant computations of both LLMs and reward models (RMs). Central to CARDS is an uncertainty-based segmentation mechanism, which ensures the accuracy of RMs evaluations on incomplete segments. Furthermore, we provide a detailed analysis of reward scores on segments to elucidate the improved alignment performance. Experimental results demonstrate that CARDS significantly improves decoding efficiency, alignment quality, and general utility compared to existing decoding-time alignment methods, achieving approximately a 70% reduction in decoding time and over 90% win-ties in utility and safety benchmarks.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2406.16306 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2406.16306 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2406.16306 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.