Update README.md
README.md CHANGED

@@ -14,6 +14,7 @@ tags:
 - chat
 - conversational
 - reasoning
+- vllm
 inference:
   parameters:
     temperature: 0
@@ -24,6 +25,18 @@ widget:
 library_name: transformers
 ---
 
+# Sharded weights checkpoints
+
+This is derived directly from [`save_sharded_state.py`](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/save_sharded_state.py) to be used with vLLM with `-tp=2`:
+
+```bash
+vllm serve aarnphm/phi-4-reasoning-plus-sharded-tp2 \
+  -tp=2 \
+  --load-format sharded_state
+```
+
+---
+
 # Phi-4-reasoning-plus Model Card
 
 [Phi-4-reasoning Technical Report](https://huggingface.co/papers/2504.21318)