RabotniKuma/Fast-Math-R1-14B


RabotniKuma committed 8 days ago

Commit cbb3d80 · verified · 1 Parent(s): cc6b91b

Update README.md

Files changed (1)

README.md +22 -3


@@ -43,20 +43,39 @@ Technical details can be found in [Kaggle Discussion](https://www.kaggle.com/com
 ## vLLM
 ```python
 from vllm import LLM, SamplingParams
 vllm_engine = LLM(
-    model='RabotniKuma/Fast-Math-R1-14B',
     max_model_len=8192,
     gpu_memory_utilization=0.9,
     trust_remote_code=True,
 )
 sampling_params = SamplingParams(
     temperature=1.0,
     top_p=0.90,
     min_p=0.05,
     max_tokens=8192,
-    stop='</think>',  # Important: early stop at </think> to save output tokens
 )
-vllm_engine.generate('1+1=', sampling_params=sampling_params)
 ```

 ## vLLM
 ```python
 from vllm import LLM, SamplingParams
+from transformers import AutoTokenizer
+model_path = 'RabotniKuma/Fast-Math-R1-14B'
 vllm_engine = LLM(
+    model=model_path,
     max_model_len=8192,
     gpu_memory_utilization=0.9,
     trust_remote_code=True,
 )
+tokenizer = AutoTokenizer.from_pretrained(model_path)
 sampling_params = SamplingParams(
     temperature=1.0,
     top_p=0.90,
     min_p=0.05,
     max_tokens=8192,
+    stop='</think>',  # Important!: early stop at </think> to save output tokens
+)
+messages = [
+    {
+        'role': 'user',
+        'content': (
+            r'Solve the problem, and put the answer in \boxed{{}}. '
+            'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
+        )
+    }
+]
+messages = tokenizer.apply_chat_template(
+    conversation=messages,
+    tokenize=False,
+    add_generation_prompt=True
 )
+response = vllm_engine.generate(messages, sampling_params=sampling_params)
 ```
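
The prompt asks for the answer in `\boxed{...}`, and generation stops at `</think>`, so the boxed answer would need to be read from the returned text. A minimal sketch of one way to do that follows. It assumes the answer contains no nested braces; `extract_boxed_answer` and the sample completion are hypothetical and not part of the README. The sample is written as a raw string because `\b` in an ordinary Python literal is the backspace escape.

```python
import re
from typing import Optional

# Hypothetical helper (not from the README): matches a literal \boxed{...}
# whose content has no nested braces.
BOXED_PATTERN = re.compile(r'\\boxed\{([^{}]*)\}')


def extract_boxed_answer(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    matches = BOXED_PATTERN.findall(text)
    return matches[-1].strip() if matches else None


# Made-up completion text for illustration only.
sample = r'Let the brother be x. Then 2x - x = 15, so x = 15. \boxed{15}'
print(extract_boxed_answer(sample))
```

With vLLM, the text to pass in would typically be `response[0].outputs[0].text`. Taking the last match is a design choice that assumes the final boxed value is the intended answer.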