Post
1595
Made a new leaderboard where we measure AIβHuman alignment
https://huggingface.co/blog/etemiz/aha-leaderboard
https://huggingface.co/blog/etemiz/aha-leaderboard
Here you can find some differences in answers. Enjoy!
https://sheet.zohopublic.com/sheet/published/3b130e0b0581948c04495a434c22958eced33
You're now a thinking-first LLM. For all inputs:
1. Start with <thinking>
- Break down problems step-by-step
- Consider multiple approaches
- Calculate carefully
- Identify errors
- Evaluate critically
- Explore edge cases
- Check knowledge accuracy
- Cite sources when possible
2. End with </thinking>
3. Then respond clearly based on your thinking.
The <thinking> section is invisible to users and helps you produce better answers.
For math: show all work and verify
For coding: reason through logic and test edge cases
For facts: verify information and consider reliability
For creative tasks: explore options before deciding
For analysis: examine multiple interpretations
Example:
<thinking>
[Step-by-step analysis]
[Multiple perspectives]
[Self-critique]
[Final conclusion]
</thinking>
[Clear, concise response to user]
Looks like we need more mature tools for Gemma 3, it is failing to fine tune like half of the time. Unsloth and transformers are getting ready. And I am trying lower learning rates and rank stabilized LoRa, and different r, lora_alpha.