unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF

@@ -40,8 +40,6 @@ pipeline_tag: text-generation
 - Fine-tune Qwen3 (14B) for free using our Google [Colab notebook](https://docs.unsloth.ai/get-started/unsloth-notebooks)!
 - Read our Blog about Qwen3 support: [unsloth.ai/blog/qwen3](https://unsloth.ai/blog/qwen3)
 - View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).
-- Run & export your fine-tuned model to Ollama, llama.cpp or HF.
 | Unsloth supports          |    Free Notebooks                                                                                           | Performance | Memory use |
 |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
 | **Qwen3 (14B)**      | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks)               | 3x faster | 70% less |
@@ -50,30 +48,30 @@ pipeline_tag: text-generation
 | **Llama-3.2 (11B vision)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)               | 2x faster | 60% less |
 | **Qwen2.5 (7B)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(7B)-Alpaca.ipynb)               | 2x faster | 60% less |
-# Qwen3-Coder-480B-A35B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
     <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 ## Highlights
-Today, we're announcing **Qwen3-Coder**, our most agentic code model to date. **Qwen3-Coder** is available in multiple sizes, but we're excited to introduce its most powerful variant first: **Qwen3-Coder-480B-A35B-Instruct**. featuring the following key enhancements:
-- **Significant Performance** among open models on **Agentic Coding**, **Agentic Browser-Use**, and other foundational coding tasks, achieving results comparable to Claude Sonnet.
 - **Long-context Capabilities** with native support for **256K** tokens, extendable up to **1M** tokens using YaRN, optimized for repository-scale understanding.
-- **Agentic Coding** supporting for most platfrom such as **Qwen Code**, **CLINE**, featuring a specially designed function call format.
-![image/jpeg](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-Coder/qwen3-coder-main.jpg)
 ## Model Overview
-**Qwen3-480B-A35B-Instruct** has the following features:
 - Type: Causal Language Models
 - Training Stage: Pretraining & Post-training
-- Number of Parameters: 480B in total and 35B activated
-- Number of Layers: 62
-- Number of Attention Heads (GQA): 96 for Q and 8 for KV
-- Number of Experts: 160
 - Number of Activated Experts: 8
 - Context Length: **262,144 natively**.
@@ -95,7 +93,7 @@ The following contains a code snippet illustrating how to use the model generate
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model_name = "Qwen/Qwen3-480B-A35B-Instruct"
 # load the tokenizer and the model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
@@ -176,7 +174,7 @@ messages = [{'role': 'user', 'content': 'square the number 1024'}]
 completion = client.chat.completions.create(
     messages=messages,
-    model="Qwen3-480B-A35B-Instruct",
     max_tokens=65536,
     tools=tools,
 )

 - Fine-tune Qwen3 (14B) for free using our Google [Colab notebook](https://docs.unsloth.ai/get-started/unsloth-notebooks)!
 - Read our Blog about Qwen3 support: [unsloth.ai/blog/qwen3](https://unsloth.ai/blog/qwen3)
 - View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).
 | Unsloth supports          |    Free Notebooks                                                                                           | Performance | Memory use |
 |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
 | **Qwen3 (14B)**      | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks)               | 3x faster | 70% less |
 | **Llama-3.2 (11B vision)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)               | 2x faster | 60% less |
 | **Qwen2.5 (7B)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(7B)-Alpaca.ipynb)               | 2x faster | 60% less |
+# Qwen3-Coder-30B-A3B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
     <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 ## Highlights
+**Qwen3-Coder** is available in multiple sizes. Today, we're excited to introduce **Qwen3-Coder-30B-A3B-Instruct**. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:
+- **Significant Performance** among open models on **Agentic Coding**, **Agentic Browser-Use**, and other foundational coding tasks.
 - **Long-context Capabilities** with native support for **256K** tokens, extendable up to **1M** tokens using YaRN, optimized for repository-scale understanding.
+- **Agentic Coding** support for most platforms, such as **Qwen Code** and **CLINE**, featuring a specially designed function call format.
+![image/jpeg](placeholder of Qwen3-Coder-30B-A3B-Instruct performance image )
 ## Model Overview
+**Qwen3-Coder-30B-A3B-Instruct** has the following features:
 - Type: Causal Language Models
 - Training Stage: Pretraining & Post-training
+- Number of Parameters: 30.5B in total and 3.3B activated
+- Number of Layers: 48
+- Number of Attention Heads (GQA): 32 for Q and 4 for KV
+- Number of Experts: 128
 - Number of Activated Experts: 8
 - Context Length: **262,144 natively**.
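
The figures in the overview list above imply a few ratios that are easy to check by hand. The sketch below is only illustrative arithmetic, not part of the model card: the constant names and the `overview_ratios` helper are hypothetical, and reading "1M" as 1,048,576 tokens is an assumption. The last value is the ratio of the extended window to the native one, which is the kind of scaling factor a YaRN setup would typically be given; the exact setting for this model is not stated here.

```python
# Figures copied from the Model Overview list above.
TOTAL_PARAMS_B = 30.5      # total parameters, in billions
ACTIVE_PARAMS_B = 3.3      # activated parameters per token, in billions
NUM_EXPERTS = 128
NUM_ACTIVE_EXPERTS = 8
NUM_Q_HEADS = 32
NUM_KV_HEADS = 4
NATIVE_CONTEXT = 262_144   # native context length in tokens

# Assumption (not stated in the card): "1M" is read as 2**20 tokens.
EXTENDED_CONTEXT = 1_048_576


def overview_ratios() -> dict:
    """Hypothetical helper: derive ratios from the listed figures."""
    return {
        # Share of the weights that is active for any single token.
        "active_param_fraction": ACTIVE_PARAMS_B / TOTAL_PARAMS_B,
        # Share of the experts routed to for any single token.
        "active_expert_fraction": NUM_ACTIVE_EXPERTS / NUM_EXPERTS,
        # GQA group size: query heads sharing one key/value head.
        "q_heads_per_kv_head": NUM_Q_HEADS // NUM_KV_HEADS,
        # Extended window divided by native window.
        "context_extension_factor": EXTENDED_CONTEXT / NATIVE_CONTEXT,
    }


if __name__ == "__main__":
    for name, value in overview_ratios().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions, roughly a tenth of the parameters are active per token, each key/value head serves eight query heads, and the extended window is four times the native one.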
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
 # load the tokenizer and the model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 completion = client.chat.completions.create(
     messages=messages,
+    model="Qwen3-Coder-30B-A3B-Instruct",
     max_tokens=65536,
     tools=tools,
 )