🛠️ ReAligner
A flexible realignment framework is proposed to quantitatively control alignment during training and inference, combining Training-time Realignment (TrRa) and Inference-time Realignment (InRa).
- We realign DeepScaleR-1.5B model and reduce token usage without performance loss and even enhance reasoning capabilities.