unsloth/Qwen3-Coder-30B-A3B-Instruct-1M-GGUF

@@ -1,6 +1,8 @@
 ---
 tags:
 - unsloth
 base_model:
 - Qwen/Qwen3-Coder-30B-A3B-Instruct
 library_name: transformers
@@ -9,12 +11,17 @@ license_link: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main
 pipeline_tag: text-generation
 ---
 > [!NOTE]
->  Includes Unsloth **chat template fixes**! <br> For `llama.cpp`, use `--jinja`
 >
 <div>
 <p style="margin-top: 0;margin-bottom: 0;">
-    <em><a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0</a> achieves superior accuracy & outperforms other leading quants.</em>
   </p>
   <div style="display: flex; gap: 5px; align-items: center; ">
     <a href="https://github.com/unslothai/unsloth/">
@@ -23,37 +30,50 @@ pipeline_tag: text-generation
     <a href="https://discord.gg/unsloth">
       <img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
     </a>
-    <a href="https://docs.unsloth.ai/">
       <img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
     </a>
   </div>
 </div>
-# Qwen3-Coder-3B-A3B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
     <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 ## Highlights
-**Qwen3-Coder** is available in multiple sizes. Today, we're excited to introduce **Qwen3-Coder-30B-A3B-Instruct**. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:
-- **Significant Performance** among open models on **Agentic Coding**, **Agentic Browser-Use**, and other foundational coding tasks.
 - **Long-context Capabilities** with native support for **256K** tokens, extendable up to **1M** tokens using Yarn, optimized for repository-scale understanding.
-- **Agentic Coding** supporting for most platform such as **Qwen Code**, **CLINE**, featuring a specially designed function call format.
-![image/jpeg](placeholder of Qwen3-Coder-30B-A3B-Instruct performance image )
 ## Model Overview
-**Qwen3-Coder-30B-A3B-Instruct** has the following features:
 - Type: Causal Language Models
 - Training Stage: Pretraining & Post-training
-- Number of Parameters: 30.5B in total and 3.3B activated
-- Number of Layers: 48
-- Number of Attention Heads (GQA): 32 for Q and 4 for KV
-- Number of Experts: 128
 - Number of Activated Experts: 8
 - Context Length: **262,144 natively**.
@@ -75,7 +95,7 @@ The following contains a code snippet illustrating how to use the model generate
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model_name = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
 # load the tokenizer and the model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
@@ -156,7 +176,7 @@ messages = [{'role': 'user', 'content': 'square the number 1024'}]
 completion = client.chat.completions.create(
     messages=messages,
-    model="Qwen3-Coder-30B-A3B-Instruct",
     max_tokens=65536,
     tools=tools,
 )
@@ -188,4 +208,4 @@ If you find our work helpful, feel free to give us a cite.
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2505.09388},
 }
-```

 ---
 tags:
 - unsloth
+- qwen3
+- qwen
 base_model:
 - Qwen/Qwen3-Coder-30B-A3B-Instruct
 library_name: transformers
 pipeline_tag: text-generation
 ---
 > [!NOTE]
+>  Extends context length from 256K to 1 million
 >
 <div>
+  <p style="margin-bottom: 0; margin-top: 0;">
+    <strong>See <a href="https://huggingface.co/collections/unsloth/qwen3-680edabfb790c8c34a242f95">our collection</a> for all versions of Qwen3 including GGUF, 4-bit & 16-bit formats.</strong>
+  </p>
+  <p style="margin-bottom: 0;">
+    <em>Learn to run Qwen3-Coder correctly - <a href="https://docs.unsloth.ai/basics/qwen3-coder">Read our Guide</a>.</em>
+  </p>
 <p style="margin-top: 0;margin-bottom: 0;">
+   <em>See <a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0 GGUFs</a> for our quantization benchmarks.</em>
   </p>
   <div style="display: flex; gap: 5px; align-items: center; ">
     <a href="https://github.com/unslothai/unsloth/">
     <a href="https://discord.gg/unsloth">
       <img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
     </a>
+    <a href="https://docs.unsloth.ai/basics/qwen3-coder">
       <img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
     </a>
   </div>
+<h1 style="margin-top: 0rem;">✨ Read our Qwen3-Coder Guide <a href="https://docs.unsloth.ai/basics/qwen3-coder">here</a>!</h1>
 </div>
+- Fine-tune Qwen3 (14B) for free using our Google [Colab notebook](https://docs.unsloth.ai/get-started/unsloth-notebooks)!
+- Read our Blog about Qwen3 support: [unsloth.ai/blog/qwen3](https://unsloth.ai/blog/qwen3)
+- View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).
+- Run & export your fine-tuned model to Ollama, llama.cpp or HF.
+| Unsloth supports          |    Free Notebooks                                                                                           | Performance | Memory use |
+|-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
+| **Qwen3 (14B)**      | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks)               | 3x faster | 70% less |
+| **GRPO with Qwen3 (8B)**      | [▶️ Start on Colab](https://docs.unsloth.ai/get-started/unsloth-notebooks)               | 3x faster | 80% less |
+| **Llama-3.2 (3B)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(1B_and_3B)-Conversational.ipynb)               | 2.4x faster | 58% less |
+| **Llama-3.2 (11B vision)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)               | 2x faster | 60% less |
+| **Qwen2.5 (7B)**      | [▶️ Start on Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(7B)-Alpaca.ipynb)               | 2x faster | 60% less |
+# Qwen3-Coder-480B-A35B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
     <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 ## Highlights
+Today, we're announcing **Qwen3-Coder**, our most agentic code model to date. **Qwen3-Coder** is available in multiple sizes, but we're excited to introduce its most powerful variant first: **Qwen3-Coder-480B-A35B-Instruct**, featuring the following key enhancements:
+- **Significant Performance** among open models on **Agentic Coding**, **Agentic Browser-Use**, and other foundational coding tasks, achieving results comparable to Claude Sonnet.
 - **Long-context Capabilities** with native support for **256K** tokens, extendable up to **1M** tokens using Yarn, optimized for repository-scale understanding.
+- **Agentic Coding** support for most platforms, such as **Qwen Code** and **CLINE**, featuring a specially designed function call format.
+![image/jpeg](https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-Coder/qwen3-coder-main.jpg)
 ## Model Overview
+**Qwen3-Coder-480B-A35B-Instruct** has the following features:
 - Type: Causal Language Models
 - Training Stage: Pretraining & Post-training
+- Number of Parameters: 480B in total and 35B activated
+- Number of Layers: 62
+- Number of Attention Heads (GQA): 96 for Q and 8 for KV
+- Number of Experts: 160
 - Number of Activated Experts: 8
 - Context Length: **262,144 natively**.
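The figures in this list lend themselves to a quick sanity check. The sketch below is illustrative only and not part of the model card: the helper names are hypothetical, and it assumes "1M" means 2**20 = 1,048,576 tokens, which the source does not state. Under that assumption, extending the native window with YaRN corresponds to a scaling factor of 4.

```python
# Values copied from the Model Overview list above.
NATIVE_CONTEXT = 262_144   # native context length, in tokens
Q_HEADS = 96               # attention heads for queries
KV_HEADS = 8               # attention heads for keys/values (GQA)
TOTAL_PARAMS_B = 480       # total parameters, in billions
ACTIVE_PARAMS_B = 35       # activated parameters, in billions

# Assumption (not stated in the source): "1M" is read as 2**20 tokens.
TARGET_CONTEXT = 2**20


def yarn_factor(native: int, target: int) -> float:
    """Ratio by which the native context window must be stretched."""
    return target / native


def queries_per_kv_head(q_heads: int, kv_heads: int) -> int:
    """Number of query heads that share one key/value head under GQA."""
    if q_heads % kv_heads != 0:
        raise ValueError("query heads must divide evenly among KV heads")
    return q_heads // kv_heads


def active_fraction(active: float, total: float) -> float:
    """Share of the total parameters that is activated."""
    return active / total


if __name__ == "__main__":
    print(yarn_factor(NATIVE_CONTEXT, TARGET_CONTEXT))
    print(queries_per_kv_head(Q_HEADS, KV_HEADS))
    print(f"{active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")
```

Run as a script, this prints `4.0`, `12`, and `7.3%`: a 4x context stretch, 12 query heads per key/value head, and roughly 7% of the parameters activated.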
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "Qwen/Qwen3-Coder-480B-A35B-Instruct"
 # load the tokenizer and the model
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 completion = client.chat.completions.create(
     messages=messages,
+    model="Qwen3-Coder-480B-A35B-Instruct",
     max_tokens=65536,
     tools=tools,
 )
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2505.09388},
 }
+```