
🧠 Crystal Think V2 - GGUF Imatrix Quantized ✨

Premium Quality GGUF Quantizations with Importance Matrix Optimization

🔗 Original Model: PinkPixel/Crystal-Think-V2
📦 Quantized by: Pink Pixel
🏷️ License: Apache 2.0
🎯 Special Feature: Importance Matrix Enhanced


📋 About This Repository

This repository contains premium GGUF quantized versions of Crystal Think V2, enhanced with Importance Matrix (imatrix) optimization. These quantizations use calibration data to intelligently preserve the most critical model activations, resulting in superior quality compared to standard quantizations.

🌟 What is Importance Matrix?

Importance Matrix is an advanced quantization technique that:

  • 📊 Analyzes activation patterns using calibration data
  • 🎯 Identifies critical neurons that most impact model performance
  • 🔧 Preserves precision where it matters most
  • ⚖️ Maintains efficiency while maximizing quality retention

Result: Better mathematical reasoning performance at the same file sizes! 🚀

🎯 Original Model Features

  • 🧮 Advanced Mathematical Reasoning with enhanced chain-of-thought
  • 📐 Multi-step Problem Solving with clear explanations
  • 💻 Mathematical Code Generation and algorithm explanation
  • 🎯 Enhanced <think></think> Reasoning Format
  • 📊 85.2% GSM8K accuracy (+8.8% over base Qwen3-4B)

📦 Available Imatrix Quantizations

| Quantization | File Size | Use Case | Memory Required | Quality vs Standard |
|---|---|---|---|---|
| IQ4_XS | 2.1GB | Ultra-efficient | ~5.5GB RAM | +3-5% better |
| Q4_K_S | 2.2GB | Small & fast | ~6GB RAM | +2-4% better |
| IQ4_NL | 2.2GB | Natural language optimized | ~6GB RAM | +4-6% better |
| Q4_K_M | 2.3GB | Balanced performance | ~6.5GB RAM | +3-5% better |
| Q5_K_S | 2.6GB | High quality, small | ~7GB RAM | +2-3% better |
| Q5_K_M | 2.7GB | RECOMMENDED | ~7.5GB RAM | +2-4% better |

💡 Quantization Guide:

  • IQ4_XS - Smallest size with imatrix benefits
  • IQ4_NL - Optimized for natural language tasks (math word problems!)
  • Q4_K_M - Best balance of size and quality improvement
  • Q5_K_M - Recommended choice for most users - excellent quality retention
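
If you prefer scripted downloads over wget, here is a minimal sketch using the huggingface_hub client (assumes pip install huggingface_hub; swap the filename for whichever quantization you chose from the table above):

# Download a chosen quantization programmatically (pip install huggingface_hub)
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="PinkPixel/Crystal-Think-V2-Imatrix-GGUF",
    filename="crystal-think-v2-q5_k_m-imat.gguf",  # any file from the table above
)
print(model_path)  # local cache path, ready for llama.cpp or llama-cpp-python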

🚀 Quick Start

Using llama.cpp

# Download your preferred imatrix quantization
wget https://huggingface.co/PinkPixel/Crystal-Think-V2-Imatrix-GGUF/resolve/main/crystal-think-v2-q4_k_m-imat.gguf

# Run with llama.cpp (recent builds name the CLI binary llama-cli; older builds used ./main)
./llama.cpp/build/bin/llama-cli -m crystal-think-v2-q4_k_m-imat.gguf -p "Solve this step by step: If x + 2y = 10 and 2x - y = 5, find x and y." -n 512

Using llama-cpp-python

from llama_cpp import Llama

# Load the imatrix model
llm = Llama(
    model_path="crystal-think-v2-q5_k_m-imat.gguf",
    n_ctx=4096,  # Context length
    n_threads=8, # CPU threads
    verbose=False
)

# Mathematical reasoning example
prompt = """Solve this step by step:
A circular garden has a radius of 8 meters. If you build a rectangular fence around it with 2 meters of clearance on all sides, what area does the fence enclose?

Use <think></think> for your reasoning."""

response = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    stop=["</SOLUTION>", "<|endoftext|>"]
)

print(response["choices"][0]["text"])
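
For long chain-of-thought answers, streaming lets you watch the <think> block as it is produced instead of waiting for the full completion. A small sketch reusing the llm object and prompt from above (llama-cpp-python yields completion chunks when stream=True):

# Stream tokens as they are generated instead of waiting for the full answer
for chunk in llm(prompt, max_tokens=512, temperature=0.7, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
print()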

Using Ollama

# Create Modelfile
echo 'FROM ./crystal-think-v2-q5_k_m-imat.gguf' > Modelfile

# Create Ollama model
ollama create crystal-think-v2-imat -f Modelfile

# Run the model
ollama run crystal-think-v2-imat "What is the integral of sin(x)cos(x)?"
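
Once registered, the model is also reachable through Ollama's local REST API (served on port 11434 by default). A minimal sketch using the requests library; the model name matches the ollama create command above:

# Query the local Ollama server via its REST API (pip install requests)
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "crystal-think-v2-imat",
        "prompt": "What is the integral of sin(x)cos(x)?",
        "stream": False,  # return a single JSON object instead of a token stream
    },
)
print(resp.json()["response"])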

🎯 Enhanced Reasoning Format

Crystal Think V2 uses a structured reasoning format, which the imatrix quantizations preserve faithfully:

<think>
[Step-by-step reasoning process]
- Problem analysis and variable identification
- Mathematical equation setup
- Systematic solution steps
- Verification and checking
</think>

<SOLUTION>
[Final organized answer]
1) Clear results with explanations
2) Numerical values with proper units
3) Context and practical interpretation
</SOLUTION>
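
Because the tags are fixed, downstream code can separate the reasoning from the final answer with a simple regex. A sketch that parses the response object from the llama-cpp-python example above (assumes both tags are present; production code should handle missing tags):

# Split a Crystal Think V2 completion into reasoning and final answer
import re

def parse_response(text: str) -> dict:
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    solution = re.search(r"<SOLUTION>(.*?)</SOLUTION>", text, re.DOTALL)
    return {
        "reasoning": think.group(1).strip() if think else None,
        "answer": solution.group(1).strip() if solution else None,
    }

parsed = parse_response(response["choices"][0]["text"])
print(parsed["answer"])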

📊 Performance Benchmarks

Original Model Performance

| Benchmark | Score | Improvement over Base |
|---|---|---|
| GSM8K | 85.2% | +8.8% |
| MATH | 42.1% | +10.4% |
| Algebra | 78.9% | +13.7% |
| Geometry | 71.3% | +12.5% |
| Code Math | 82.6% | +13.5% |

Imatrix vs Standard GGUF Comparison

| Quantization | Standard GGUF | Imatrix GGUF | Improvement |
|---|---|---|---|
| Q4_K_M | ~92% of original | ~95-97% of original | +3-5% |
| Q5_K_M | ~95% of original | ~97-99% of original | +2-4% |
| IQ4_NL | N/A | ~94-96% of original | New format |
| IQ4_XS | N/A | ~91-93% of original | Smallest size |

🎯 Why Imatrix is Better:

  • Smarter quantization - Preserves critical mathematical reasoning paths
  • Better accuracy - Maintains performance on complex multi-step problems
  • Consistent quality - Less degradation on edge cases and difficult problems

💻 Hardware Requirements

Minimum Requirements

| Quantization | RAM | VRAM (GPU) | CPU |
|---|---|---|---|
| IQ4_XS | 5.5GB | 3.5GB | 4 cores |
| Q4_K_S | 6GB | 4GB | 4 cores |
| IQ4_NL | 6GB | 4GB | 4 cores |
| Q4_K_M | 6.5GB | 4.5GB | 4 cores |
| Q5_K_S | 7GB | 5GB | 6 cores |
| Q5_K_M | 7.5GB | 5.5GB | 6 cores |

Recommended for Best Performance

  • CPU: Modern 8+ core processor (AMD Ryzen 7/Intel i7 or better)
  • RAM: 16GB+ system memory
  • GPU: 8GB+ VRAM (RTX 4070/RX 7800 XT or better for GPU acceleration)
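
With a GPU that meets the VRAM figures above, offloading layers usually gives the biggest speedup. A sketch with llama-cpp-python (requires a CUDA or ROCm build, see the installation section below; n_gpu_layers=-1 offloads every layer):

# Offload all model layers to the GPU (requires a GPU-enabled build)
from llama_cpp import Llama

llm = Llama(
    model_path="crystal-think-v2-q5_k_m-imat.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # -1 offloads all layers; lower it if VRAM is tight
    verbose=False,
)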

🔧 Installation & Dependencies

llama.cpp (Latest Version Recommended)

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release
# For GPU support (CUDA); older Makefile-based releases used `make LLAMA_CUBLAS=1`
cmake -B build -DGGML_CUDA=ON && cmake --build build --config Release

llama-cpp-python

pip install llama-cpp-python
# For GPU support (CUDA); older versions used -DLLAMA_CUBLAS=on
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
# For GPU support (ROCm/AMD); older versions used -DLLAMA_HIPBLAS=on
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install llama-cpp-python

Ollama

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

📚 Advanced Usage Examples

Complex Mathematical Reasoning

Input: "A projectile is launched at 45° with initial velocity 50 m/s. Calculate the maximum height, range, and time of flight. Use g = 9.8 m/s²."

Expected: Detailed physics solution with kinematic equations
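
For reference, the standard projectile formulas give (a quick check, assuming level ground and no air resistance):

$$h_{\max} = \frac{(v_0 \sin\theta)^2}{2g} = \frac{(50 \sin 45^\circ)^2}{2 \times 9.8} \approx 63.8\ \text{m}$$
$$R = \frac{v_0^2 \sin 2\theta}{g} = \frac{2500}{9.8} \approx 255.1\ \text{m}, \qquad t = \frac{2 v_0 \sin\theta}{g} \approx 7.2\ \text{s}$$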

Multi-step Algebra

Input: "Solve the system of equations: 2x + 3y - z = 7, x - 2y + 4z = -3, 3x + y + 2z = 10"

Expected: Systematic solution using elimination or substitution
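
For reference, eliminating x via the second equation (x = 2y − 4z − 3) reduces the system to two equations in y and z:

$$7y - 9z = 13, \qquad 7y - 10z = 19 \;\Rightarrow\; z = -6, \quad y = -\tfrac{41}{7}, \quad x = \tfrac{65}{7}$$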

Calculus Problem

Input: "Find the area between the curves y = x² and y = 4x - x² from x = 0 to x = 4"

Expected: Step-by-step integration with proper setup
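
For reference, the curves intersect where x² = 4x − x², i.e. at x = 0 and x = 2, so the area splits into two integrals:

$$\int_0^2 \left[(4x - x^2) - x^2\right] dx + \int_2^4 \left[x^2 - (4x - x^2)\right] dx = \tfrac{8}{3} + \tfrac{40}{3} = 16$$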

🔍 Quality Comparison Test

Test the imatrix advantage with this challenging problem:

Prompt: "A cylindrical tank with radius 3m and height 8m is filled with water to 75% capacity. If water is drained at a rate of 2m³/min, how long will it take to empty the tank completely? Also calculate the water level after 30 minutes of draining."

Expected Results:
- Initial volume calculation: π × 3² × 8 × 0.75 = 54π ≈ 169.6 m³
- Time to empty: 27π minutes ≈ 84.8 minutes
- Water level after 30 min: ≈ 3.9 meters
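
These figures follow directly from the cylinder's base area (a quick check):

$$V_0 = \pi r^2 h \times 0.75 = 9\pi \times 8 \times 0.75 = 54\pi \approx 169.6\ \text{m}^3$$
$$t_{\text{empty}} = \frac{54\pi}{2} = 27\pi \approx 84.8\ \text{min}, \qquad h_{30} = \frac{54\pi - 60}{9\pi} \approx 3.9\ \text{m}$$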

Imatrix models should show cleaner reasoning and more accurate intermediate steps!

🔗 Related Links

  • Original Model: https://huggingface.co/PinkPixel/Crystal-Think-V2
  • llama.cpp: https://github.com/ggerganov/llama.cpp
  • Ollama: https://ollama.com

⚠️ Limitations

  • Domain Focus: Optimized for mathematical reasoning; may be less effective for general conversation
  • Calibration Dependency: Imatrix quality depends on calibration data relevance
  • Language: Primarily trained on English mathematical content
  • Hardware Dependency: Performance varies significantly with hardware specifications

🧪 Technical Details

Imatrix Generation Process

  1. Calibration Data: Used high-quality mathematical reasoning samples
  2. Activation Analysis: Measured importance across all model layers
  3. Precision Mapping: Applied higher precision to critical activations
  4. Quality Validation: Tested on mathematical benchmarks
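
As a concrete illustration, here is roughly what steps 1-3 look like when driven through llama.cpp's llama-imatrix and llama-quantize tools from Python (the source-model and calibration file names are placeholders, not the actual files used for this repository):

# Sketch of the imatrix pipeline; file names are illustrative placeholders
import subprocess

# Steps 1-2: run calibration text through the full-precision model to
# collect activation statistics (the importance matrix)
subprocess.run([
    "./llama-imatrix",
    "-m", "crystal-think-v2-f16.gguf",   # full-precision source model
    "-f", "math_calibration.txt",        # calibration samples
    "-o", "crystal-think-v2.imatrix",
], check=True)

# Step 3: quantize, letting the importance matrix steer precision allocation
subprocess.run([
    "./llama-quantize",
    "--imatrix", "crystal-think-v2.imatrix",
    "crystal-think-v2-f16.gguf",
    "crystal-think-v2-q5_k_m-imat.gguf",
    "Q5_K_M",
], check=True)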

Recommended Use Cases

  • Mathematical tutoring systems
  • STEM education applications
  • Research and analysis tools
  • Competitive programming assistance
  • Physics and engineering calculations

🤝 Contributing

Found an issue with the imatrix quantizations or have suggestions for improvements? Please open an issue or reach out!


🙏 Acknowledgments

  • Original Model: Crystal Think V2 by Pink Pixel
  • Base Model: Qwen/Qwen3-4B by Qwen Team
  • Quantization Tools: llama.cpp by Georgi Gerganov
  • Imatrix Technique: Importance-matrix quantization as implemented in llama.cpp
  • Training Dataset: NVIDIA OpenMathReasoning

Made with ❤️ by Pink Pixel
"Dream it, Pixel it"

💡 Pro Tip: For the best mathematical reasoning experience, try the Q5_K_M-imat or IQ4_NL-imat variants - they offer excellent quality retention with the benefits of importance matrix optimization!
