---
base_model: unsloth/qwen2.5-7b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
license: apache-2.0
language:
- en
datasets:
- kenhktsui/longtalk-cot-v0.1
---
|
# kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged
|
|
|
After 1 epoch of SFT, the model scores 26% higher (relative) on the ProcessBench GSM-8k subset than the base model. However, it scores lower on GSM-8k itself.

There are two likely explanations: 1) the metrics are not measured correctly because of answer parsing, or 2) the model is overfitting to answers, as it is not able to critique reasoning steps.
|
|
|
|Benchmark|[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
|---|---|---|
|ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
|GSM-8k|78.54|**84.23**|
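
The 26% figure is a relative change, not a difference in percentage points. As a quick sanity check, the short sketch below recomputes both the absolute and the relative change from the scores in the table; the helper name `relative_change` is only illustrative.

```python
# Scores copied from the table above.
processbench_sft, processbench_base = 46.0, 36.5   # base score is the reported one
gsm8k_sft, gsm8k_base = 78.54, 84.23


def relative_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for name, new, old in [
    ("ProcessBench (GSM-8k)", processbench_sft, processbench_base),
    ("GSM-8k", gsm8k_sft, gsm8k_base),
]:
    print(f"{name}: {new - old:+.2f} points, {relative_change(new, old):+.1f}% relative")
# ProcessBench (GSM-8k): +9.50 points, +26.0% relative
# GSM-8k: -5.69 points, -6.8% relative
```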
|
|
|
This needs to be investigated further. |
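
One way to start on the parsing hypothesis is to check how answers are pulled out of the longer responses. The fine-tuned model tends to end with a `\boxed{...}` answer, so a scorer that only looks for a bare number could misread it. The sketch below is a hypothetical extractor, not the evaluation code behind the numbers above; the function names and the fallback rule are assumptions made for illustration.

```python
import re
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last boxed expression in text, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def extract_last_number(text: str) -> Optional[str]:
    """Fallback (assumed rule): the last number in text, thousands separators removed."""
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None


def extract_answer(text: str) -> Optional[str]:
    """Prefer a boxed answer; otherwise fall back to the last number."""
    return extract_boxed(text) or extract_last_number(text)


print(extract_answer(r"**Final Answer** \[ \boxed{3} \]"))   # 3
print(extract_answer("So the total is 1,234."))              # 1234
```

Comparing the extracted answers against the references on a sample of GSM-8k responses would show whether the score gap comes from parsing or from the model itself.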
|
I have inspected how **model behavior changed** after SFT. As expected, the finetuned model thinks much longer, and often solves a problem with multiple approaches instead of one.
|
|
|
### Quick Test 1: Is 9.11 greater than 9.9?
|
<details> |
|
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary> |
|
No, 9.11 is not greater than 9.9. Let's think about this step by step. |
|
|
|
First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their |
|
values in each place value: whole number part and the tenths and hundredths parts. |
|
|
|
Let's break them down: |
|
|
|
- For 9.11: |
|
- The whole number part is 9. |
|
- The tenths place is 1 (which means one tenth). |
|
- The hundredths place is also 1 (one hundredth). |
|
|
|
- For 9.9: |
|
- The whole number part is also 9. |
|
- But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits |
|
are filled with zeros. |
|
|
|
So, comparing them: |
|
|
|
- Both have the same whole number part: 9. |
|
- For the tenth place: |
|
- 1 vs. 9. |
|
|
|
Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well: |
|
|
|
- In 9.11: hundredths = 1 |
|
- In 9.9: hundredths = 0 |
|
|
|
Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work. |
|
|
|
Therefore, overall, 9.11 < 9.9. |
|
|
|
Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers |
|
have the same whole number part and one has a larger tenths digit, it's clear which is larger. |
|
|
|
Another way to look at it is to consider them as percentages: |
|
|
|
- 9.11 = 911% |
|
- 9.9 = 990% |
|
|
|
But that might be confusing; in percentage terms, 990% is still greater than 911%. |
|
|
|
Alternatively, if we multiply both numbers by 100 to eliminate the decimal point: |
|
|
|
- 9.11 * 100 = 911 |
|
- 9.9 * 100 = 990 |
|
|
|
Again, 990 is greater than 911. |
|
|
|
So, no matter how we look at it, 9.9 is larger than 9.11. |
|
|
|
I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values. |
|
|
|
**Final Answer** |
|
|
|
\[ \boxed{9.11 < 9.9} \] |
|
</details> |
|
|
|
<details> |
|
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
|
No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case: |
|
|
|
- The whole number part of both numbers is the same (9). |
|
- Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9. |
|
- Therefore, 9.11 > 9.9. |
|
|
|
So, 9.11 is greater than 9.9. |
|
</details> |
|
|
|
|
|
### Quick Test 2: How many letters R in the word strawberry?
|
<details> |
|
<summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary> |
|
Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word. |
|
|
|
First, let me write down the word again: s-t-r-a-w-b-e-r-r-y. |
|
|
|
Now, I'll go through each letter and check if it's an 'R'. |
|
|
|
Starting with the first letter: S - that's not R. |
|
|
|
Second letter: T - also not R. |
|
|
|
Third letter: R - yes, that's an 'R'. So, one so far. |
|
|
|
Fourth letter: A - nope. |
|
|
|
Fifth letter: W - still not R. |
|
|
|
Sixth letter: B - not R either. |
|
|
|
Seventh letter: E - no. |
|
|
|
Eighth letter: R - another 'R'! That makes two now. |
|
|
|
Ninth letter: B - no R here. |
|
|
|
Tenth letter: E - no R. |
|
|
|
Eleventh letter: R - yet again, an 'R'. Now it's three. |
|
|
|
Twelfth letter: Y - not R. |
|
|
|
So, counting all the way through, I've found three 'R' letters in "strawberry." |
|
|
|
Wait a minute, let me double-check. Maybe there are more 'R's that I missed. |
|
|
|
Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions. |
|
|
|
I think that's it. There don't seem to be any other 'R's in the word. |
|
|
|
Therefore, there are 3 letters R in the word "strawberry." |
|
|
|
**Final Answer** |
|
|
|
\[ \boxed{3} \] |
|
</details> |
|
<details> |
|
<summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary> |
|
The letter "R" appears 1 time in the word "strawberry". |
|
</details> |
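
For reference, the ground truth for both quick tests can be checked in a few lines of Python. This is only a sanity check of the expected answers. `Decimal` is used so the comparison is done on the exact decimal values, and positions are 1-based.

```python
from decimal import Decimal

# Quick Test 1: is 9.11 greater than 9.9?
is_greater = Decimal("9.11") > Decimal("9.9")
print(is_greater)  # False

# Quick Test 2: how many letters R in "strawberry"?
word = "strawberry"
count = word.count("r")
positions = [i for i, ch in enumerate(word, start=1) if ch == "r"]
print(count, positions)  # 3 [3, 8, 9]
```

The fine-tuned model's final answers match in both tests, although the positions it lists in its reasoning for Quick Test 2 (3rd, 8th and 11th) differ from the actual ones.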
|
|
|
### Try out in Ollama!
|
```
ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
```
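
Once the model has been pulled, it can also be queried from code through Ollama's local HTTP API. The sketch below is a minimal example using only the standard library. It assumes the Ollama server is running on its default port (11434) and that the model is registered under the same `hf.co/...` name used in the command above; `build_request` and `ask` are illustrative helper names.

```python
import json
import urllib.request

# Assumption: same model name as in the `ollama run` command.
MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"
# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = MODEL, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(ask("Is 9.11 greater than 9.9?"))` should return a long, step-by-step response similar to the one in Quick Test 1.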
|
|
|
# Uploaded model
|
|
|
- **Developed by:** kenhktsui
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-7b-instruct-bnb-4bit
|
|
|
This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |