openfree posted an update 3 days ago
🚀 Introducing Phi-4-reasoning-plus: Powerful 14B Reasoning Model by Microsoft!

VIDraft/phi-4-reasoning-plus

🌟 Key Highlights
Compact Size (14B parameters): Efficient in environments with limited computing resources while remaining highly capable.

Extended Context (32k tokens): Capable of handling lengthy and complex input sequences.

Enhanced Reasoning: Excels at multi-step reasoning, particularly in mathematics, science, and coding challenges.

Chain-of-Thought Methodology: Provides a detailed reasoning process, followed by concise, accurate summaries.
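As a quick illustration of the chain-of-thought flow described above, here is a minimal transformers sketch. The Hub id, the <think>...</think> reasoning-tag format, and the temperature 0.8 / top_p 0.95 sampling values are assumptions drawn from the model card, so verify them against the checkpoint you actually use:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub id; adjust to the checkpoint you actually run.
model_id = "microsoft/Phi-4-reasoning-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=4096,  # reasoning traces can run long
    do_sample=True,
    temperature=0.8,      # sampling settings suggested in the model card
    top_p=0.95,
)
text = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

# The model writes its detailed reasoning first, then a concise summary;
# assuming the <think>...</think> tag convention, split the two apart.
reasoning, sep, answer = text.partition("</think>")
print(answer.strip() if sep else text.strip())
```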

πŸ… Benchmark Achievements
Despite its smaller size, Phi-4-reasoning-plus has delivered impressive results, often surpassing significantly larger models:

Mathematical Reasoning (AIME 2025): Achieved an accuracy of 78%, significantly outperforming larger models like DeepSeek-R1 Distilled (51.5%) and Claude-3.7 Sonnet (58.7%).

Olympiad-level Math (OmniMath): Strong performance with an accuracy of 81.9%, surpassing DeepSeek-R1 Distilled's 63.4%.

Graduate-Level Science Questions (GPQA-Diamond): Delivered competitive performance at 68.9%, close to larger models and demonstrating its capabilities in advanced scientific reasoning.

Coding Challenges (LiveCodeBench): Scored 53.1%, reflecting strong performance among smaller models, though slightly behind specialized coding-focused models.

πŸ›‘οΈ Safety and Robustness
Comprehensive safety evaluation completed through Microsoft's independent AI Red Team assessments.

High standards of alignment and responsible AI compliance validated through extensive adversarial testing.

🎯 Recommended Applications
Phi-4-reasoning-plus is especially suitable for:

Systems with limited computational resources.

Latency-sensitive applications requiring quick yet accurate responses.

📜 License
Freely available under the MIT License for broad accessibility and flexible integration into your projects.

I just wish it would work locally the same way as in your Space. In llama.cpp it's thinking way too long, even if I just say "hello". Same sampling parameters, though I'm not sure about the template; I'm using the one from Unsloth.

If you're experiencing slow response times on your local machine, it's likely due to insufficient GPU resources. I would recommend switching to a lighter-weight model such as Phi-4-reasoning, which is designed for better performance on limited hardware while still maintaining strong reasoning capabilities.
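For local llama.cpp runs, a minimal llama-cpp-python sketch with the sampling settings commonly suggested for the Phi-4 reasoning models may help narrow things down. The GGUF filename and the max_tokens cap below are illustrative assumptions, and overlong "thinking" on trivial prompts is often a chat-template mismatch, so it is also worth confirming that the template embedded in your GGUF matches the one the Space uses:

```python
from llama_cpp import Llama

# Assumed local GGUF filename; quantization and path will differ per setup.
llm = Llama(
    model_path="phi-4-reasoning-plus-Q4_K_M.gguf",
    n_ctx=32768,      # the model's full 32k context window
    n_gpu_layers=-1,  # offload all layers to GPU; lower this on small VRAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "hello"}],
    temperature=0.8,  # sampling settings commonly suggested for these models
    top_p=0.95,
    max_tokens=1024,  # hard cap so a simple greeting cannot reason indefinitely
)
print(out["choices"][0]["message"]["content"])
```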
