## Table of Contents
- Model Overview
- Key Capabilities
- Training Details
- Dataset Information
- Usage Guide
- Benchmarks
- Limitations
- Acknowledgments
- Citation
## Model Overview
Manticore-32B is a specialized fine-tuned version of Qwen3-32B, engineered to excel at complex reasoning tasks through intensive training on high-quality synthetic data. Developed by Daemontatox, this model combines the raw power of Qwen3 with targeted optimization for step-by-step problem solving across multiple domains.
**Base Model:** `unsloth/qwen3-32b-unsloth`
## Key Capabilities
Manticore-32B demonstrates exceptional performance in:
- Mathematical Reasoning: Complex problem solving with detailed step-by-step explanations
- Logical Deduction: Ability to handle intricate puzzles and logical problems
- Code Generation: Writing efficient, well-documented code across multiple languages
- Chain-of-Thought Reasoning: Breaking down complex problems into manageable steps
- Multi-step Problem Solving: Maintaining coherence across extended reasoning chains
## Training Details
- Framework: Fine-tuned with TRL + LoRA using Unsloth acceleration (a hedged sketch of this setup follows the list)
- Optimization: Quantized for efficient inference at 4-bit precision (BitsAndBytes 4-bit)
- Training Process:
  - Custom fine-tuning across ~1 million samples
  - Specific focus on multi-step reasoning tasks
  - Progressive learning-rate scheduling for stable convergence
- Hardware: Single-node setup with an A100 80GB GPU
- Training Objective: Enhance multi-domain reasoning capabilities while maintaining computational efficiency
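To make the setup above concrete, here is a minimal sketch of a TRL + LoRA + Unsloth fine-tuning run. The hyperparameters, sequence length, LoRA rank, and dataset handling below are illustrative assumptions, not the exact recipe used for Manticore-32B.

```python
# Illustrative sketch only: hyperparameters and dataset handling are assumed,
# not the published Manticore-32B training configuration.
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load the base model in 4-bit with Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-32b-unsloth",  # base model named in this card
    max_seq_length=8192,                     # assumed
    load_in_4bit=True,
)

# Attach LoRA adapters (rank and alpha are assumed values)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# One of the reasoning datasets listed in the next section; in practice the
# conversations would be rendered into a single text field first.
dataset = load_dataset("open-thoughts/OpenThoughts2-1M", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",          # progressive learning-rate schedule
        num_train_epochs=1,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```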
## Dataset Information
The model was trained on a carefully curated combination of high-quality reasoning datasets:
| Dataset | Size | Focus Area | Content Type |
|---|---|---|---|
| OpenThoughts2-1M | ~1.1M examples | General reasoning | Multi-turn conversations, step-by-step solutions |
| OpenR1-Math-220k | 220K examples | Mathematical reasoning | Problem statements with detailed solutions |
| OpenMathReasoning | Supplementary | Advanced mathematics | University-level math problems |
These datasets were processed and filtered with Curator Viewer to retain only high-quality training examples.
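The listed datasets can be pulled directly from the Hugging Face Hub. The snippet below is a hedged sketch of how such a mix might be assembled and filtered; the repository IDs are inferred from the dataset names above, and the length-based filter is purely illustrative, since the actual Curator-based filtering criteria are not published.

```python
from datasets import load_dataset

# Repository IDs inferred from the dataset names above (assumed; verify on the Hub)
open_thoughts = load_dataset("open-thoughts/OpenThoughts2-1M", split="train")
openr1_math = load_dataset("open-r1/OpenR1-Math-220k", split="train")

def looks_substantive(example):
    # Crude illustrative quality filter: keep rows whose serialized content
    # is reasonably long. The real filtering used Curator and was stricter.
    return len(str(example)) > 200

filtered_math = openr1_math.filter(looks_substantive)
print(open_thoughts)
print(filtered_math)
```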
## Usage Guide
### Quick Start
```python
from transformers import pipeline

# Initialize the text-generation pipeline with the model
pipe = pipeline(
    "text-generation",
    model="Daemontatox/Manticore-32B",
    torch_dtype="auto",
    device_map="auto",
)

# Basic chat format
messages = [
    {"role": "user", "content": "Can you solve this math problem step by step? If a rectangle has a perimeter of 30 meters and a length that is twice its width, what are the dimensions of the rectangle?"}
]

# Generate a response
response = pipe(
    messages,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)

# With chat-style input, generated_text holds the full conversation;
# the last message is the assistant's reply.
print(response[0]["generated_text"][-1]["content"])
```
### Advanced Usage
For more control over generation parameters and to utilize advanced features:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load tokenizer and 4-bit quantized model
tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Manticore-32B")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Manticore-32B",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)

# Format messages in chat format
messages = [
    {"role": "system", "content": "You are Manticore-32B, an AI assistant specialized in reasoning and problem-solving. Always show your work step-by-step when tackling problems."},
    {"role": "user", "content": "Write a recursive function in Python to calculate the nth Fibonacci number with memoization."}
]

# Create the prompt using the chat template, appending the assistant generation prompt
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate with more control
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
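For reference, a correct answer to the Fibonacci prompt above would look roughly like the following. This is only the expected shape of the output, not a captured model response.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Return the nth Fibonacci number, memoizing recursive calls."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```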
### Using with Unsloth for Even Faster Inference
```python
import torch
from unsloth import FastLanguageModel

# Load with Unsloth for optimized 4-bit inference
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Daemontatox/Manticore-32B",
    dtype=torch.bfloat16,
    load_in_4bit=True,
    token="your_huggingface_token",  # optional, only needed for gated/private access
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# Create the prompt
messages = [
    {"role": "user", "content": "Explain the concept of computational complexity and give examples of O(1), O(n), and O(n²) algorithms."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
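For interactive use, streaming tokens as they are generated makes long step-by-step answers easier to follow. This uses standard transformers functionality rather than anything specific to Manticore-32B, and it reuses the `model`, `tokenizer`, and `prompt` from either example above.

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are produced
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    streamer=streamer,
)
```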
## Benchmarks
Manticore-32B demonstrates strong performance across multiple reasoning benchmarks:
| Benchmark | Score | Base Model Score | Improvement |
|---|---|---|---|
| GSM8K | 78.2% | 71.5% | +6.7% |
| MATH | 42.5% | 37.8% | +4.7% |
| HumanEval | 75.6% | 71.3% | +4.3% |
| BBH | 69.3% | 64.8% | +4.5% |
*Note: These scores reflect zero-shot performance with greedy decoding (temperature = 0.0).*
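To reproduce that greedy, temperature-0 setting with the loading code from the Usage Guide, disable sampling at generation time. This shows only the decoding configuration; the benchmark harness and prompts are not included here.

```python
# Greedy decoding (equivalent to temperature = 0.0); reuses `model` and `inputs` from above
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=False,
)
```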
## Limitations
Despite its strengths, users should be aware of the following limitations:
- Language Support: Primarily optimized for English; performance degrades significantly for other languages
- Factual Accuracy: While reasoning skills are enhanced, the model may still hallucinate factual information
- Domain Knowledge: Specialized knowledge outside common domains may be limited or incorrect
- Context Window: Inherited from Qwen3-32B (32K tokens natively, extendable to roughly 128K with YaRN-based scaling)
- Bias: Inherits potential biases from base model and synthetic training data
## Acknowledgments
This model builds upon the exceptional work of:
- Qwen Team for the base Qwen3-32B model
- Unsloth for optimization techniques
- OpenThoughts Team for their invaluable dataset
## Citation
If you use this model in your research or applications, please cite:
```bibtex
@misc{daemontatox2025manticore,
  author       = {Daemontatox},
  title        = {Manticore-32B: A Fine-tuned Language Model for Advanced Reasoning},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Daemontatox/Manticore-32B}}
}
```