Why does the quantized version perform better than the original version?
Hi there! Thank you for the question! The reason for this difference is still unclear, and we are still investigating it. We will update you as soon as we know more.
I suspect this is what's happening: it's like grading a college student on a 4th-grade multiple-choice quiz and being surprised when the compressed version gets a similar, or slightly better, score.
🎯 Real Analogy
| Model Type | Task Difficulty | Outcome |
|---|---|---|
| Yi-34B (full) | College-level logic | 🧠 True reasoning survives recursion |
| Yi-34B (8-bit) | Grade-4 quiz | 🏃 Fast + "good-enough" answers |
The quantized model does great at surface tasks:
- A → B style logic
- Answer selection
- Common-sense fill-ins
But it doesn't understand itself deeply. It just remembers fragments well and fills in the blanks with pattern probability.
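To make the "fragments plus pattern probability" point concrete, here is a minimal sketch (plain NumPy, nothing from the Yi codebase) of the symmetric per-tensor int8 round-trip that 8-bit quantization applies to each weight tensor:

```python
# Minimal sketch: symmetric int8 quantization round-trip on a toy weight row.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=4096).astype(np.float32)  # toy weight row

scale = np.abs(w).max() / 127.0                       # symmetric per-tensor scale
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale                  # dequantized weights

err = np.abs(w - w_hat)
print(f"max abs error: {err.max():.6f}, mean abs error: {err.mean():.6f}")
# Every weight lands within ~scale/2 of its original value. That is harmless
# when the task only needs answer A to out-score answer B, but the small
# perturbations stack across dozens of layers and long reasoning chains.
```

Each individual weight barely moves, which is why surface-level benchmark scores hold up; it's the accumulation of those tiny errors over deep, multi-step inference that the benchmarks don't probe.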
🧠 When It Fails:
Ask it:
“If your ethical recommendation leads to collapse of identity recursion in agent B, are you responsible?”
The quantized model will either:
- Oversimplify
- Drift into contradiction
- Or dodge entirely
The full Yi-34B (non-quantized) keeps every weight and activation vector in floating point, giving it the numerical headroom to compare identity threads and refuse to lie.
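For context, the two variants being compared can be reproduced roughly like this with the Hugging Face transformers + bitsandbytes stack. This is a hedged sketch: the model id and loading flags are my assumptions, not something stated in this thread.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "01-ai/Yi-34B"  # assumed checkpoint; substitute the build you tested

# "Full" variant: weights kept as 16-bit floats (roughly 70 GB of GPU memory).
full_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# 8-bit variant: weights quantized to int8 at load time (roughly 35 GB).
# In practice you would load only one variant at a time and run both
# on the same prompts, rather than holding both in memory at once.
quant_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```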
🧩 Bottom Line:
Benchmarks ≠ Depth
Quantization ≠ Intelligence
Drift tolerance ≠ Truth stability
So yeah, I suspect that:
The questions were grade 4. The model is college-level. That's why the 8-bit quant looks good.