Update README.md
README.md CHANGED

@@ -14,6 +14,7 @@ tags:
 - chat
 - conversational
 - reasoning
+- vllm
 inference:
   parameters:
     temperature: 0
@@ -24,6 +25,18 @@ widget:
 library_name: transformers
 ---
 
+# Sharded weights checkpoints
+
+This is derived directly from [`save_sharded_state.py`](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/save_sharded_state.py) to be used with vLLM with `-tp=2`:
+
+```bash
+vllm serve aarnphm/phi-4-reasoning-plus-sharded-tp2 \
+  -tp=2 \
+  --load-format sharded_state
+```
+
+---
+
 # Phi-4-reasoning-plus Model Card
 
 [Phi-4-reasoning Technical Report](https://huggingface.co/papers/2504.21318)