Hu
Moses25
·
AI & ML interests
None yet
Recent Activity
new activity
30 days ago
deepseek-ai/DeepSeek-R1: garbled output
new activity
about 2 months ago
Qwen/Qwen2.5-7B-Instruct: error when accelerating the qwen2.5-7B model with vllm 0.6.6
Organizations
Moses25's activity
Adding <think>\n after chat template will cause vllm to not return reasoning_content (null) when reasoning
6
#144 opened about 1 month ago
by
kebeliu
Error when accelerating the qwen2.5-7B model with vllm 0.6.6
#15 opened about 2 months ago
by
Moses25
why is the system prompt missing?
1
#87 opened 3 months ago
by
Moses25
Is there a problem with input_ids = torch.as_tensor(inputs.input_ids).cuda()?
#37 opened 3 months ago
by
Moses25
where is the video dataset?
1
#5 opened 5 months ago
by
Moses25
[bot] Conversion to Parquet
#1 opened 9 months ago
by
parquet-converter

Has the template been changed?
1
#4 opened 10 months ago
by
okcwang
Why isn't a chat template added to tokenizer_config.json?
2
#8 opened 10 months ago
by
Moses25
Error when launching this app.py locally
1
#6 opened about 1 year ago
by
Moses25
How do I train qwen72B on multi-turn conversation data?
#4 opened about 1 year ago
by
Moses25
Where is the system prompt set?
1
#6 opened about 1 year ago
by
Moses25
there is no sliding_window in params.json
1
#41 opened over 1 year ago
by
Moses25
can you show the training process on GitHub?
#5 opened over 1 year ago
by
Moses25
how to add special tokens?
1
#21 opened over 1 year ago
by
Moses25
what is the prompt of instruction?
#9 opened over 1 year ago
by
Moses25
can this model be used commercially?
1
#3 opened over 1 year ago
by
Moses25
how is Orca different from Alpaca?
2
#1 opened over 1 year ago
by
Moses25
Prompt tuning in Bloom for long form text generation
9
#149 opened over 2 years ago
by
info2000