Haili TIAN
haili-tian
AI & ML interests
None yet
Recent Activity
new activity
1 day ago
deepseek-ai/DeepSeek-R1:Lite version for DeepSeek-R1?
new activity
6 days ago
deepseek-ai/DeepSeek-R1-Distill-Llama-70B:weight files naming is not regular rule
new activity
6 days ago
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B:weight files naming is not regular rule
Organizations
None yet
haili-tian's activity
Lite version for DeepSeek-R1?
#137 opened 1 day ago
by
haili-tian
weight files naming is not regular rule
#13 opened 6 days ago
by
haili-tian
weight files naming is not regular rule
#29 opened 6 days ago
by
haili-tian
bos_token_id is defined incorrectly
1
#28 opened 6 days ago
by
haili-tian
System Prompt
17
#2 opened 22 days ago
by
Wanfq
What temp are these expected to be used at?
2
#6 opened 21 days ago
by
rombodawg
running on local machine
7
#19 opened 14 days ago
by
saidavanam
System Prompt
13
#2 opened 22 days ago
by
Wanfq
Can not use HF transformers for inference?
#11 opened 4 months ago
by
haili-tian
max_window_layers is 70?
2
#1 opened 5 months ago
by
haili-tian
sliding_window is null?
1
#84 opened 5 months ago
by
haili-tian
Qwen1.5 series, I choose Qwen1.5-32B
#3 opened 9 months ago
by
haili-tian
Qwen1.5-32B?
#4 opened 9 months ago
by
haili-tian