Qwen3-4B-Thinking-2507

Model Description

Qwen3-4B-Thinking-2507 is a 4-billion-parameter causal language model tuned for deep, structured reasoning. It runs exclusively in "thinking mode", automatically including its chain-of-thought in outputs without requiring any special flags. The model natively supports a context window of 262,144 tokens, making it well suited to multi-step logic, academic tasks, math, and code.

Features

  • Explicit reasoning: Outputs contain intermediate steps, enclosed in thinking tags, to improve transparency and interpretability.
  • Massive context: Handles up to 262,144 tokens natively.
  • Advanced reasoning: Excels in logic, math, science, coding, and academic benchmarks.
  • General capability uplift: Strengthened instruction following, tool use, text generation, and human preference alignment.

Use Cases

  • Explaining complex problems with clear reasoning workflow
  • Academic or STEM tutoring applications
  • Code generation with logic transparency
  • Agents that need to show how they think through a query
  • Processing lengthy documents with deep inference

Inputs and Outputs

Input:

  • Natural language problems, coding tasks, or academic questions that benefit from step-by-step decomposition.

Output:

  • Structured chain-of-thought (wrapped in <think>…</think> tags), followed by the final answer or solution.
  • Note: The default chat template automatically inserts the opening <think> tag, so the model's output typically contains only the closing </think> tag (see the parsing sketch below).
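Because only the closing </think> tag is guaranteed to appear in the generated text, downstream code usually has to split the reasoning segment from the final answer itself. The minimal Python sketch below assumes the decoded completion is already available as a string (captured from nexa infer or any other runtime); the helper name and variables are illustrative and not part of the Nexa SDK.

# Minimal sketch: split a Qwen3 "thinking" completion into reasoning and answer.
# Assumes `raw_output` holds the decoded text from whatever runtime you use;
# `split_thinking` is an illustrative helper, not an SDK function.

def split_thinking(raw_output: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a completion whose chain-of-thought
    ends with a closing </think> tag."""
    marker = "</think>"
    if marker in raw_output:
        reasoning, answer = raw_output.split(marker, 1)
        # The opening <think> tag is injected by the chat template, so it may
        # be absent from the generated text; strip it if it is present.
        reasoning = reasoning.replace("<think>", "").strip()
        return reasoning, answer.strip()
    # No tag found: treat the whole completion as the answer.
    return "", raw_output.strip()

example = "Step 1: compare the two fractions...\n</think>\nThe larger value is 3/4."
thoughts, final_answer = split_thinking(example)
print(final_answer)  # -> "The larger value is 3/4."

Keeping the reasoning and the answer as separate fields makes it easy to log or display the chain-of-thought while passing only the final answer to downstream components.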

How to use

⚠️ Hardware requirement: the model currently runs only on Qualcomm NPUs (e.g., Snapdragon-powered AI PCs).
Apple NPU support is planned next.

1) Install Nexa-SDK

  • Download the SDK and follow the steps under the "Deploy" section on Nexa's model page: Download Windows arm64 SDK
  • (Other platforms coming soon)

2) Get an access token

Create a token in the Model Hub, then log in:

nexa config set license '<access_token>'

3) Run the model

Running:

nexa infer NexaAI/Qwen3-4B-Thinking-2507-npu

License

  • Licensed under Apache-2.0
