YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

🛠️ ReAligner

A flexible realignment framework is proposed to quantitatively control alignment during training and inference, combining Training-time Realignment (TrRa) and Inference-time Realignment (InRa).

We realign DeepScaleR-1.5B model and reduce token usage without performance loss and even enhance reasoning capabilities.

Downloads last month: 5

Safetensors

Model size

1.78B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including wh-zhu/DeepSeek-R1-TrRa-1.5B_lambda_0.5

Realigner-TrRa

Collection

7 items • Updated 21 days ago