IFEval score of 55 on instruction-level loose accuracy
#70 opened 6 days ago
by
Jamesunnc
Upload hokm96.xlsx
#69 opened 6 days ago
by
ashkanpourali
Poor support for Southwest Asian languages
1
#68 opened 9 days ago
by
rastegar
Sliding Window Attention
#67 opened 10 days ago
by
cferggie

Update README.md
#66 opened 12 days ago
by
sujiij
When using a ReAct agent on complex problems, the model only ever calls a function once
1
#64 opened 13 days ago
by
tyler1990
Reasoning stuck in an infinite loop
#63 opened 13 days ago
by
creampx

Seeking Advice on Fine-tuning QWQ-32B Model
2
#62 opened 17 days ago
by
aaditya

Steps to deploy a production ready service for QwQ on AWS using serverless GPUs
#61 opened 18 days ago
by
samagra14
Etude
#60 opened 20 days ago
by
Nino-ogvng

Upload gen_ai(proj_47).ipynb
#59 opened 20 days ago
by
harshith1411

How does LiveCodeBench test?
3
#58 opened 24 days ago
by
cizhenshi
Create test
2
#57 opened 24 days ago
by
Amyww

Upload 99ed5d4766696bd4ebc26e5d9c23e982.png
1
#56 opened 24 days ago
by
axingd
Best practice for QwQ-32B evaluation
3
#55 opened 25 days ago
by
wangxingjun778

Create test.txt
#54 opened 26 days ago
by
xxxx443117

How to continue pretraining / SFT on this model? Any suggestions?
4
#53 opened 26 days ago
by
Ken0102030405
Infinite repetitive thinking for this case
1
#52 opened 27 days ago
by
zhaocc1106
Can QwQ-32B be used for custom NER extraction? Does it work better than other open-source BERT transformers?
1
#51 opened 27 days ago
by
amiirhmza
Budget forcing?
1
#50 opened 28 days ago
by
mwettach
Allow prefilling assistant message
1
#49 opened 29 days ago
by
tomasmcm

Update README.md
#48 opened 29 days ago
by
Bschleter
Day of the week
#47 opened 29 days ago
by
jac-jim
Intermittent CUDA error with model.generate() using device_map="auto" and 3 GPUs
#46 opened 30 days ago
by
lucmaz98
Create Call Center Tunaiku 0818836245
#45 opened 30 days ago
by
Jokiio
Does a MacBook M1 Max with 64GB run this model well?
1
#44 opened 30 days ago
by
mrk83
Too many "cross-validate" and "another method"
2
#43 opened 30 days ago
by
AaronFeng753
RuntimeError: Error(s) in loading state_dict for Qwen2ForCausalLM:
1
#42 opened 30 days ago
by
XuehangCang

An 8GB GPU can run this, 10 t/s
2
#41 opened about 1 month ago
by
wqerrewetw
When answering questions in Chinese, the model frequently terminates prematurely (outputs the end token). Is this a common problem?
1
#40 opened about 1 month ago
by
zhangw355
Refining QWQ Model Output: Direct Responses Without Step-by-Step Reasoning
1
#39 opened about 1 month ago
by
gslinx
It's challenging for QwQ to generate long code...
2
#38 opened about 1 month ago
by
DXBTR74
Nice work... can't-believe-it's-just-32B performance, even with system prompts in various different tones.
#37 opened about 1 month ago
by
imoc
Is there a way to skip the think step during function calls?
2
#36 opened about 1 month ago
by
zhaocc1106
Failed to parse Jinja template:
2
#35 opened about 1 month ago
by
Vicnent

Obligatory question about model sizes...
#34 opened about 1 month ago
by
MrDevolver

This model beats Qwen Max!
6
#33 opened about 1 month ago
by
MrDevolver

remove part about long context modifications
#32 opened about 1 month ago
by
nbroad

add a reasoning effort option
1
#31 opened about 1 month ago
by
TheBigBlockPC
What parameters should be used with vLLM?
6
#30 opened about 1 month ago
by
daiwk
On complex problems, reasoning starts with <think> but never emits </think> when it finishes
6
#29 opened about 1 month ago
by
digits12
Is this model native 128K context length, or YaRN extended?
7
#28 opened about 1 month ago
by
danielhanchen

docs: update README.md
#27 opened about 1 month ago
by
eltociear

Thanks a lot for sharing this model!
#26 opened about 1 month ago
by
FalconNet
Doesn't Generate `<think>` tags
3
#25 opened about 1 month ago
by
bingw5
[Garbled output] With input around 10k tokens and output beyond 1,000 tokens, the ending becomes garbled. Consistently reproducible!
1
#24 opened about 1 month ago
by
chizhu

Is a system prompt needed?
1
#23 opened about 1 month ago
by
wphtrying
A pure C++ high-performance OpenAI-compatible LLM service powered by TensorRT-LLM and GRPS, with support for QwQ.
#22 opened about 1 month ago
by
zhaocc1106
Complex reasoning enters an infinite loop
30
#21 opened about 1 month ago
by
frankgxy