# aquif-3.5
The aquif-3.5 series is the successor to aquif-3, featuring a simplified naming scheme, expanded Mixture of Experts (MoE) options, and across-the-board performance improvements. This release streamlines model selection while delivering enhanced capabilities across reasoning, multilingual support, and general intelligence tasks.
## Model Repository Links
Model | HuggingFace Repository |
---|---|
aquif-3.5-A0.6B-Preview | aquiffoo/aquif-3.5-A0.6B-Preview |
aquif-3.5-3B | aquiffoo/aquif-3.5-3B |
aquif-3.5-7B | aquiffoo/aquif-3.5-7B |
aquif-3.5-8B-Think | aquiffoo/aquif-3.5-8B-Think |
aquif-3.5-A4B-Think | aquiffoo/aquif-3.5-A4B-Think |
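The checkpoints can be pulled straight from the Hub. Below is a minimal loading sketch using the Hugging Face `transformers` library; the repository ID comes from the table above, while the prompt, generation settings, and installed packages (`torch`, `accelerate`) are assumptions for illustration.

```python
# Minimal sketch: load an aquif-3.5 checkpoint with Hugging Face transformers.
# Assumes transformers, torch, and accelerate are installed; the prompt is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aquiffoo/aquif-3.5-3B"  # any repository from the table above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt and generate a reply.
messages = [{"role": "user", "content": "Explain Mixture of Experts in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```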
## Model Overview
Model | Total Params (B) | Active Params (B) | Reasoning | MoE | Multilingual | MMLU | Context Window |
---|---|---|---|---|---|---|---|
aquif-3.5-A0.6B | 2.61 | 0.6 | ❌ | ✅ | ✅ | 60.5% | 4k |
aquif-3.5-3B | 2.67 | 2.67 | ❌ | ❌ | ✅ | 70.2% | 32k |
aquif-3.5-7B | 7.3 | 7.3 | ❌ | ❌ | ✅ | 78.5% | 16k |
aquif-3.5-8B-Think | 8.2 | 8.2 | ✅ | ❌ | ✅ | 81.1% | 40k |
aquif-3.5-A4B-Think | 12 | 4 | ✅ | ✅ | ✅ | 86.9% | 128k |
## Model Details
### aquif-3.5-A0.6B (Experimental MoE)
An experimental small-scale Mixture of Experts model designed for multilingual applications with minimal computational overhead. Despite its compact active parameter count, it demonstrates competitive performance against larger dense models.
Performance Comparison:
Metric | aquif-3.5 (2.6B A0.6B) | Qwen3 (0.8B) | LFM2 (0.7B) | aquif-3 (0.4B) |
---|---|---|---|---|
MMLU | 60.5 | 44.9 | 49.9 | 55.6 |
GPQA | 30.2 | 22.1 | 28.5 | 28.5 |
GSM8K | 50.7 | 36.5 | 46.4 | 52.1 |
HumanEval | 45.2 | 36.0 | 40.0 | 37.4 |
Average | 46.7 | 34.9 | 41.2 | 43.4 |
### aquif-3.5-3B (State-of-the-Art Dense)
The new standard for small dense models, offering strong performance per parameter for general-purpose applications.
Performance Comparison:
Metric | aquif-3.5 (2.7B) | EXAONE 3.5 (2.4B) | Qwen3 (4B) | Gemma 3 (4B) | Phi-4-mini (3.8B) | Apriel-5B-Instruct (4.8B) | aquif-3 (3.2B) |
---|---|---|---|---|---|---|---|
MMLU (General Knowledge) | 70.2 | 60.4 | 70.4 | 59.6 | 67.3 | 64.6 | 67.5 |
GPQA Diamond (Science) | 35.8 | 28.4 | 39.3 | 30.9 | 25.2 | 28.4 | 36.1 |
LiveCodeBench (Coding) | 23.1 | 12.5 | 21.3 | 11.2 | 10.4 | 11.6 | 15.4 |
IFEval (Instruction Following) | 78.9 | 73.6 | 71.2 | 80.2 | 68.6 | 80.8 | 78.9 |
AIME 2025 (Competition Math) | 13.4 | 4.5 | 9.8 | 12.7 | 5.3 | 4.3 | 9.6 |
Average | 44.3 | 35.9 | 42.4 | 38.9 | 35.4 | 37.9 | 41.5 |
### aquif-3.5-7B (Multilingual Long Context)
A Qwen-based model optimized for multilingual applications and extended-context tasks, delivering state-of-the-art performance in its size class.
Performance Comparison:
Metric | aquif-3.5 (7.3B) | EXAONE 3.5 (7.8B) | Qwen3 (8.2B) | Gemma 3 (12B) | Llama 3.1 (8B) | Kanana 1.5 (8B) | aquif-3 (3.2B) |
---|---|---|---|---|---|---|---|
MMLU (General Knowledge) | 78.5 | 72.2 | 82.9 | 74.5 | 69.2 | 68.8 | 67.5 |
GPQA Diamond (Science) | 42.3 | 39.4 | 39.3 | 40.9 | 32.8 | 37.5 | 36.1 |
LiveCodeBench (Coding) | 21.3 | 18.0 | 23.9 | 13.7 | 10.8 | 16.5 | 15.4 |
IFEval (Instruction Following) | 85.6 | 82.6 | 85.4 | 80.2 | 75.0 | 80.1 | 78.9 |
AIME 2025 (Competition Math) | 23.4 | 18.3 | 20.9 | 18.8 | 2.7 | 13.4 | 9.6 |
Average | 50.2 | 46.1 | 50.4 | 45.6 | 38.1 | 43.3 | 41.5 |
### aquif-3.5-8B-Think & aquif-3.5-A4B-Think (Reasoning Models)
Advanced reasoning-capable models designed for complex problem-solving tasks. The A4B variant leverages an MoE architecture for greater efficiency while maintaining superior reasoning performance.
Performance Comparison:
Metric | aquif-3.5 (12B A4B) | aquif-3.5 (8B) | Qwen3 Thinking 2507 (31B A3B) | gpt-oss-20b (21B A4B) | Nemotron Nano v2 (9B) | Solar Pro 2 |
---|---|---|---|---|---|---|
MMLU-Pro | 78.5 | 78.1 | 80.5 | 73.6 | 74.2 | 80.5 |
GPQA Diamond | 70.8 | 66.8 | 70.7 | 61.7 | 64.0 | 68.7 |
AIME 2025 | 84.4 | 81.4 | 56.3 | 61.7 | 69.7 | 61.3 |
LiveCodeBench | 66.1 | 61.5 | 70.7 | 72.1 | 71.1 | 61.6 |
Humanity's Last Exam | 8.9 | 8.2 | 9.8 | 8.5 | 6.5 | 7.0 |
TAU-Bench v2 (avg) | 43.7 | 36.8 | 35.7 | 43.2 | 34.9 | 38.7 |
Average | 58.7 | 55.5 | 54.0 | 53.5 | 53.4 | 53.0 |
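As a hedged usage sketch, the Think variants can be driven like any chat model. Whether the chat template exposes a toggle for the reasoning trace (the Qwen-style `enable_thinking` flag below, suggested by the acknowledged base architecture) is an assumption, so check the repository's tokenizer configuration before relying on it.

```python
# Hedged sketch for the Think variants. `enable_thinking` is an assumed, Qwen-style
# chat-template flag rather than a documented aquif parameter; templates that do not
# define it simply ignore the extra keyword.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aquiffoo/aquif-3.5-8B-Think"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "If 3x + 7 = 22, what is x? Show your reasoning."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # assumed flag, see note above
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```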
## Key Improvements Over aquif-3
- Simplified Naming: Clear size-based nomenclature for easier model selection
- Enhanced MoE Support: Multiple MoE configurations across different model sizes
- Reasoning Capabilities: Dedicated thinking models for complex problem-solving
- Extended Context: Up to 128k context window for long-form applications
- Multilingual by Default: Native multilingual support across all variants
- Performance Gains: 5-15% improvement across benchmarks compared to aquif-3
## Usage Recommendations
- aquif-3.5-A0.6B: Experimental applications, resource-constrained environments
- aquif-3.5-3B: General-purpose applications, balanced performance/efficiency
- aquif-3.5-7B: Multilingual applications, long-context tasks
- aquif-3.5-8B-Think: Complex reasoning, scientific analysis
- aquif-3.5-A4B-Think: Advanced reasoning with efficiency optimization
## Technical Specifications
All models support:
- BF16 and FP16 precision (see the loading sketch below)
- Standard transformer architecture optimizations
- Efficient attention mechanisms
- Multi-head attention with optimized KV caching
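As a sketch of the precision options above (assuming a recent GPU and the `accelerate` package), a checkpoint can be loaded explicitly in BF16; FP16-only hardware can substitute `torch.float16`.

```python
# Sketch: explicit BF16 loading, per the precision support listed above.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "aquiffoo/aquif-3.5-7B",
    torch_dtype=torch.bfloat16,  # or torch.float16 on FP16-only GPUs
    device_map="auto",
)
print(next(model.parameters()).dtype)  # expected: torch.bfloat16
```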
## Acknowledgements
- Qwen Team: Base architecture for 7B, 8B, and 12B-A4B models
- Meta Llama Team: Base architecture for 3B and 2.6B-A0.6B models
- Hugging Face: Model hosting infrastructure and training libraries
## License
This project is released under the Apache 2.0 License. See the LICENSE file for details.
Made in 🇧🇷
© 2025 aquif AI. All rights reserved.