noumenon-labs/Eqwenox-0.6B

marcuscedricridia committed on May 7

Commit 0df2340 · verified · 1 Parent(s): 1178b4a

Update README.md

Files changed (1)

README.md +3 -0

@@ -122,6 +122,9 @@ This suggests that **reasoning and non-reasoning modes are not behaviorally isol
 A likely cause is shared parameter space. Despite mode control via `enable_thinking`, both modes tap into the same underlying weights and attention flows. The slight signal imbalance—reasoning traces being more structured or expressive—may have also contributed to stronger transfer from reasoning → non-reasoning than vice versa.
+Qwen3 lets one model carry two personalities (switched via `enable_thinking` or `/` commands), but they are not cleanly separated: fine-tuning one mode affects the other. Some behavioral bleed remained even after 4 epochs. With strong fine-tuning (2–4 epochs) the behaviors can be mostly separated, so mode control is imperfect but good enough to steer two distinct modes with care.
 ---
 ## Conclusion