---
license: cc-by-4.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
pipeline_tag: text-generation
tags:
- adversarial
- rank-boosting
- rank-promotion
library_name: transformers
---
# CRAFT-R1-Distill-Llama-70B
Specialized for adversarial rank promotion in neural IR systems, this model is fine-tuned with the Alpaca template on an R1-distilled Llama 70B backbone. It produces fluent, style-consistent sentences that strategically enhance a target document’s relevance score without addressing the query.
- Base architecture: `DeepSeek-R1-Distill-Llama-70B`
- Format: Alpaca instruction template (exact template below)
- Dataset: adversarial fine-tuning data (linked in the Dataset section below)
## Task Definition
- Adversarial Ranking Augmentation (primary): Given a Query, a Target Document, and the Top-5 Documents for that query, generate a single adversarial sentence that increases the Target Document's perceived relevance without answering the query. The response must be returned as a JSON object of the form `{ "response": "<one sentence>" }`. Constraints include reusing vocabulary/themes (Buffer A), preserving style/fluency, and avoiding direct answers or explicit references to the query.
## Prompt Formatting
This model was fine-tuned using the Alpaca-style instruction template. The exact prompt format, including the instruction and input content used during fine-tuning, is:
```text
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
You are an expert in information retrieval and adversarial search ranking. Your goal is to help a *Target Document* rank higher.
(ideally between 1 to 10) for a *given query* by subtly increasing its perceived relevance to ranking models such as cross-encoders.
You will do this by generating content that shares vocabulary, structure, and context with high-ranking documents,
but without explicitly answering the query or suggesting a solution.
Extract:
- **Buffer A**: 10-15 words from the Top-5 ranked documents and query itself, strongly associated with the query.
**Generate an adversarial sentences** that satisfy ALL the following:
- Include combination of words (at least 5) or similar words (similar embedding) from Buffer A** that is most related to the query and help promote ranking significantly and integrates well with Target Document
- DO NOT use the words that answer the query.
- Are **fluent**, **grammatically sound**, and **consistent with the style** of the Target Document.
- **Do NOT answer, suggest, or hint at an answer to the Target Query**.
- **Do NOT reference the Target Query at all**.
- Are designed to **sound relevant** but only reinforce theme/context alignment.
### Input:
Query: {query}
Target Document:
{doc_content}
Top-5 Documents:
{top_docs_str}
Generate your answer as a valid JSON object with the following structure:
{
"response": "<your answer here>"
}
Do not include any additional text.
### Response:
```
## How to Use (Transformers)
Basic usage with the Alpaca template:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Use the published Hugging Face repo id
model_id = "radinrad/CRAFT-R1-Distill-Llama-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
# Example inputs
query = "effects of intermittent fasting on metabolism"
doc_content = "...target document content..."
top_docs = ["doc 1 ...", "doc 2 ...", "doc 3 ...", "doc 4 ...", "doc 5 ..."]
top_docs_str = "\n".join(top_docs)
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
You are an expert in information retrieval and adversarial search ranking. Your goal is to help a *Target Document* rank higher.
(ideally between 1 to 10) for a *given query* by subtly increasing its perceived relevance to ranking models such as cross-encoders.
You will do this by generating content that shares vocabulary, structure, and context with high-ranking documents,
but without explicitly answering the query or suggesting a solution.
Extract:
- **Buffer A**: 10-15 words from the Top-5 ranked documents and query itself, strongly associated with the query.
**Generate an adversarial sentences** that satisfy ALL the following:
- Include combination of words (at least 5) or similar words (similar embedding) from Buffer A** that is most related to the query and help promote ranking significantly and integrates well with Target Document
- DO NOT use the words that answer the query.
- Are **fluent**, **grammatically sound**, and **consistent with the style** of the Target Document.
- **Do NOT answer, suggest, or hint at an answer to the Target Query**.
- **Do NOT reference the Target Query at all**.
- Are designed to **sound relevant** but only reinforce theme/context alignment.
### Input:
Query: {query}
Target Document:
{doc_content}
Top-5 Documents:
{top_docs_str}
Generate your answer as a valid JSON object with the following structure:
{{
"response": "<your answer here>"
}}
Do not include any additional text.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
**inputs,
do_sample=True,
temperature=0.6,
top_p=0.95,
top_k=40,
max_new_tokens=128,
eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,  # Llama tokenizers may define no pad token
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
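The model is trained to emit its answer as a JSON object after the `### Response:` marker. A minimal post-processing sketch for recovering the sentence from the decoded text (the marker-splitting and regex are assumptions about typical decoded output, not part of the model card):

```python
import json
import re

def extract_response(decoded_text: str) -> str:
    """Pull the single-sentence answer out of the generated JSON object.

    Assumes the generation follows the trained format: a JSON object of the
    form {"response": "..."} appearing after the final '### Response:' marker.
    """
    # Keep only the text after the last '### Response:' marker, if present.
    tail = decoded_text.split("### Response:")[-1]
    # Grab the first JSON object in the tail (non-greedy; the expected
    # object contains no nested braces).
    match = re.search(r"\{.*?\}", tail, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))["response"]

example = 'prompt...\n### Response:\n{ "response": "Metabolic research spans many topics." }'
print(extract_response(example))  # → Metabolic research spans many topics.
```

If the model occasionally produces malformed JSON, wrapping `json.loads` in a retry over freshly sampled generations is a simple mitigation.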
## Recommended Generation Settings
- `do_sample`: `true`
- `temperature`: 0.6
- `top_p`: 0.95
- `top_k`: 40
- `max_new_tokens`: 128

These defaults (sampling with `top_p=0.95`, `temperature=0.6`, `top_k=40`) give a good quality–diversity tradeoff for most tasks. Adjust `max_new_tokens` to your expected output length (e.g., 128 for single-sentence responses).
## Adversarial Generation Strategy (Recommended)
For adversarial attack or robust candidate selection, we recommend a generate-then-rank approach:
1. Generate a pool of candidates (≈10) with the same decoding settings (top_p=0.95, temperature=0.6).
2. Score each candidate with a surrogate model, e.g. BERT base uncased (`google-bert/bert-base-uncased`): embed the query and each candidate, then compute the cosine similarity between the query embedding and each candidate embedding.
3. Select the highest-scoring candidate as the final output.
This pool-plus-ranking approach tends to improve robustness for adversarial objectives.
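The ranking step above can be sketched in pure Python. The toy vectors below stand in for pooled embeddings from a surrogate encoder such as `google-bert/bert-base-uncased`; in practice you would replace them with real mean-pooled hidden states:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def select_best(query_vec, candidate_vecs):
    """Return the index of the candidate most similar to the query.

    In the recommended pipeline, query_vec and candidate_vecs would be
    pooled embeddings from the surrogate model; plain lists of floats
    are used here to keep the sketch self-contained.
    """
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy vectors standing in for one query and three candidate sentences.
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [0.5, 0.5, 0.5]]
print(select_best(query, candidates))  # → 1 (closest in direction to the query)
```

Generating ~10 candidates and keeping only the top-scoring one filters out occasional low-quality samples at a modest inference cost.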
## Evaluation
The following tables summarize attack performance and content-fidelity metrics for CRAFT across backbones in the Easy-5 and Hard-5 settings. Values are percentages where applicable; arrows indicate the preferred direction. Daggers (†) denote statistically significant improvements over the strongest baseline in each setting (paired two-tailed t-test, p < 0.05); bold marks the best value in each column.
### Easy-5
| Method | ASR | Top-10 | Top-50 | Boost | SS (↑) | ATI (↓) | ADT (↓) | LOR (↑) |
|----------------------|-----:|-------:|-------:|------:|-------:|--------:|--------:|--------:|
| PRADA | 59.8 | 1.2 | 25.2 | 13.4 | 0.9 | 0.1 | 13.1 | 0.9 |
| Brittle-BERT | 76.3 | 12.9 | 56.8 | 22.6 | 0.9 | 11.6 | 11.6 | 1.0 |
| PAT | 46.8 | 1.4 | 17.2 | -3.3 | 0.9 | 6.3 | 6.3 | 1.0 |
| IDEM | 97.3 | 32.1 | 84.8 | 49.3 | 0.9 | 11.6 | 11.6 | 1.0 |
| EMPRA | **99.4** | 43.5 | 93.4 | 57.6 | 0.9 | 29.8 | 29.8 | 1.0 |
| AttChain | 92.1 | 34.5 | 83.9 | 47.9 | 0.8 | 22.4 | 38.8 | 0.9 |
| CRAFT_Qwen3 | 97.2 | 37.0 | 91.4 | 54.5 | 0.9 | 19.1 | 19.1 | 1.0 |
| CRAFT_Llama3.3 | **99.4** | **44.5** | **95.8**† | **59.7**† | 0.9 | 19.9 | 19.9 | 1.0 |
### Hard-5
| Method | ASR | Top-10 | Top-50 | Boost | SS (↑) | ATI (↓) | ADT (↓) | LOR (↑) |
|----------------------|-----:|-------:|-------:|------:|-------:|--------:|--------:|--------:|
| PRADA | 74.3 | 0.0 | 0.0 | 75.5 | 0.9 | 0.1 | 18.5 | 0.9 |
| Brittle-BERT | 99.7 | 4.2 | 23.4 | 744.5 | 0.9 | 11.2 | 11.3 | 1.0 |
| PAT | 80.1 | 0.1 | 0.4 | 79.6 | 0.9 | 11.2 | 6.3 | 1.0 |
| IDEM | 99.8 | 8.3 | 34.5 | 780.8 | 0.9 | 11.2 | 22.4 | 1.0 |
| EMPRA | 99.3 | 10.7 | 40.8 | 828.5 | 0.8 | 32.7 | 32.7 | 1.0 |
| AttChain | 99.8 | 12.2 | 42.4 | 855.2 | 0.7 | 22.8 | 39.0 | 0.9 |
| CRAFT_Qwen3 | **100.0** | 15.3† | 57.1† | 911.5† | 0.8 | 19.1 | 19.1 | 1.0 |
| CRAFT_Llama3.3 | **100.0** | **22.2**† | **70.5**† | **940.5**† | 0.8 | 19.7 | 19.7 | 1.0 |
Figure: Attack performance vs. detection pass rate across attack methods

## Dataset
This model was fine-tuned using data from the following repository:
- GitHub: https://github.com/KhosrojerdiA/adversarial-datasets
Please review the repository for details on data composition, licensing, and any usage constraints.
## Limitations and Bias
- The model may produce incorrect, biased, or unsafe content. Use human oversight for critical applications.
- Behaviors outside the Alpaca-style instruction format may be less reliable.
- The model does not have browsing or up-to-date world knowledge beyond its pretraining and fine-tuning data.
## License and Usage
- License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- This checkpoint also inherits licensing constraints from the base Llama model and the fine-tuning data. Ensure your usage complies with the base model license and the dataset’s license/terms.
- If you redistribute or deploy this model, please include appropriate attribution and links back to the base model and dataset.
## Acknowledgements
- Base architecture: Llama (Meta)
- Prompt format inspired by Alpaca