IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
This repository contains the Qwen2.5-32B-Instruct-IFDecorator model, presented in the paper IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards.
IFDecorator is a framework that improves the efficiency and robustness of Reinforcement Learning with Verifiable Rewards (RLVR) for instruction following in Large Language Models (LLMs), addressing the training inefficiency and over-optimization common in previous RLVR approaches.
Key Innovations:
- Cooperative-Adversarial Data Flywheel: Co-evolves instructions and hybrid verifications to generate progressively more challenging instruction-verification pairs.
- IntentCheck Module: A bypass mechanism designed to enforce alignment with the actual intent of user instructions.
- Trip Wires: A diagnostic mechanism that detects and captures reward hacking behaviors through trap instructions (the sketch after this list illustrates the kind of verifiable check these build on).
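For readers unfamiliar with RLVR, below is a minimal, hypothetical sketch of a rule-based verifier of the kind commonly used to score instruction following (IFEval-style constraints). The function names and the constraint set are illustrative assumptions and are not taken from the IFDecorator codebase or paper.

```python
# Hypothetical sketch of a verifiable reward for instruction following.
# Names and constraints are illustrative; they are NOT from the IFDecorator codebase.

def verify_response(response: str, constraints: dict) -> bool:
    """Return True only if every programmatically checkable constraint holds."""
    if "max_words" in constraints and len(response.split()) > constraints["max_words"]:
        return False
    if "required_keyword" in constraints and \
            constraints["required_keyword"].lower() not in response.lower():
        return False
    if "num_bullets" in constraints:
        bullets = [ln for ln in response.splitlines() if ln.lstrip().startswith("- ")]
        if len(bullets) != constraints["num_bullets"]:
            return False
    return True

def reward(response: str, constraints: dict) -> float:
    # Binary verifiable reward: 1.0 if all checks pass, otherwise 0.0.
    return 1.0 if verify_response(response, constraints) else 0.0
```

Trip wires extend this idea with trap instructions whose verifiers are designed to fire when a policy games the checker rather than following the user's intent; see the paper for the actual mechanism.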
Performance Highlights:
Qwen2.5-32B-Instruct-IFDecorator achieves 87.43% accuracy on IFEval, outperforming larger proprietary models such as GPT-4o. It also demonstrates substantial improvements on FollowBench while preserving general capabilities and significantly reducing reward hacking rates.
Links
- Paper: https://huggingface.co/papers/2508.04632
- Project Page: https://tianyilt.github.io/ifdecorator
- Code: https://github.com/guox18/IFDecorator
Usage
You can use this model with the Hugging Face `transformers` library. Below is a basic example of text generation using the model's chat template:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "your_model_id_here"  # Replace with the actual model ID (e.g., "author/Qwen2.5-32B-Instruct-IFDecorator")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain what Instruction Following Reinforcement Learning with Verifiable Rewards (RLVR) is."},
]

# Build the prompt from the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer(text, return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    generated_ids[0][model_inputs.input_ids.shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```
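Decoding can be tuned with the standard `transformers` sampling arguments. The values below are illustrative only; this model card does not prescribe specific settings.

```python
# Illustrative sampling settings; not recommendations from the paper or model card.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
```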
Citation
If you find this work useful, please cite the paper:
```bibtex
@article{li2025ifdecorator,
  title={IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards},
  author={Li, Tianyi and Xu, Peng and Huang, Wenhao and Huang, Songlin and Zhou, Chuanxiao and He, Kun and Peng, Shiqi and Gao, Jing and Huang, Jin and Gao, Kai},
  journal={arXiv preprint arXiv:2508.04632},
  year={2025}
}
```