ljvmiranda921 committed fde8a18 (verified, parent 1d8af28): Update README.md

Files changed (1): README.md (+34 -0)

README.md CHANGED
@@ -34,6 +34,40 @@ evaluation rubrics, and a score along with the corresponding reasoning.
  - **Repository:** https://github.com/rubricreward/r3
  - **Paper:** https://arxiv.org/abs/2505.13388
 
+ ## Using the Model
+
+ ```python
+ from transformers import AutoTokenizer
+ from vllm import LLM, SamplingParams
+
+ model_path = "rubricreward/R3-Qwen3-8B-4k"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+ # Sampling configuration for generation.
+ sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=8192, min_p=0, top_k=20)
+
+ llm = LLM(
+     model=model_path,
+     dtype="bfloat16",
+     max_model_len=10000,
+     tensor_parallel_size=2,
+     gpu_memory_utilization=0.9,
+     enforce_eager=True,
+ )
+
+ messages: list[dict[str, str]] = [
+     {'content': "Evaluate the response based on the given task, input, response, and evaluation rubric. Provide a fair and detailed assessment following the rubric...", 'role': 'user'}
+ ]
+
+ # Render the chat messages into a prompt string for vLLM.
+ list_text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True,
+     enable_thinking=True  # Switch between thinking and non-thinking modes.
+ )
+
+ outputs = llm.generate(list_text, sampling_params)
+ ```
+
  ## License and use
 
  R3 is licensed under the Apache 2.0 license.
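For reference, a minimal sketch of reading generated text back out of the `outputs` list that `llm.generate` returns in the snippet above. vLLM returns one `RequestOutput` per prompt, with the text under `output.outputs[0].text`; the dataclasses here are hypothetical stand-ins mirroring that shape so the sketch runs without vLLM or a GPU, and the sample text is illustrative only.

```python
from dataclasses import dataclass

# Hypothetical stand-ins mirroring the shape of vLLM's RequestOutput /
# CompletionOutput, so this sketch runs without vLLM installed.
@dataclass
class CompletionOutput:
    text: str

@dataclass
class RequestOutput:
    outputs: list

# In the real script this would be: outputs = llm.generate(list_text, sampling_params)
outputs = [RequestOutput(outputs=[CompletionOutput(text="Score: 4. Reasoning: ...")])]

# One RequestOutput per prompt; the generated completion is the first entry.
for output in outputs:
    generated = output.outputs[0].text
    print(generated)
```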