Overview
Newstar‑Qwen3‑0.6B‑KTO is a variant of Newstar‑Qwen3‑0.6B fine‑tuned using KTO (Kahneman‑Tversky Optimization). The KTO process adjusts model outputs based on preference data grounded in Kahneman‑Tversky behavioral patterns—focusing on risk attitudes, framing effects, and human decision biases.
This version remains in non‑thinking mode—built for consistent and bias‑aware responses, without reasoning or logic functions.
Test
Testing was done with the following parameters, so it's important to find what's best for your use case. Both models used the same parameters to ensure nobody gets a lead and it's fair:
- Temperature: 0.7
- Top P: 0.95
- Top K: 40
- Repetition Penalty: 1.1
- Do Sample: True
- Max New Tokens: 4096
Category | Prompt | Winner | Reason |
---|---|---|---|
CS (RAM vs. ROM) | KTO | KTO is clearer, more structured, and avoids inaccuracies like BASE’s claim about excessive RAM. | |
ENGINEERING (Water Filtration) | KTO | KTO provides a practical, scientifically grounded system; BASE is confusing and impractical. | |
MATH (Mean, Median, Mode) | KTO | KTO’s structured, concise explanation outperforms BASE’s wordy but accurate response. | |
SCIENCE (Osmosis vs. Diffusion) | KTO | KTO is more detailed and accurate despite a minor error; BASE oversimplifies and has vague examples. | |
WRITING (Lost Dog Story) | Write a short story about a lost dog finding its way home. | BASE | BASE focuses on the dog and partially meets the prompt; KTO is off-topic and incoherent. |
CODING (Vowel Counting) | Create a simple program that counts the number of vowels in a sentence. | BASE | BASE’s program is more robust (handles uppercase/lowercase) and includes test cases; KTO misses uppercase vowels. |
MATH SOLVING (Train Speed) | If a train travels 60 miles in 1.5 hours, what is its average speed? | KTO | Both are accurate, but KTO is more concise, delivering the result with less verbosity. |
COMMON SENSE LOGIC (Ice Melting) | If you leave ice outside on a hot day, what happens to it? | KTO | KTO accurately describes melting; BASE’s sublimation claim is incorrect. |
SOFT REASONING (Dog Barking) | If all dogs bark and Rex is a dog, does Rex bark? Why? | BASE | BASE provides a clearer affirmation despite flaws; KTO overcomplicates and undermines the premise. |
RIDDLE (Keys and Locks) | What has keys but can’t open locks? | Neither | Both fail to identify the correct answer (piano) and provide irrelevant explanations. |
GENERAL CHAT (Hobby) | Tell me about a hobby you enjoy. | BASE | BASE’s detailed, engaging piano description outperforms KTO’s brief, shallow list. |
REWRITING (Formal Sentence) | Make this sentence more formal: “Can you fix the problem soon?” | KTO | KTO’s rewrite is concise and equally formal; BASE is wordy with unnecessary alternatives. |
SUMMARIZATION (Tortoise and Hare) | Summarize the story of “The Tortoise and Hare” in two sentences. | KTO | KTO is accurate and concise; BASE has factual errors (e.g., ten-day race). |
INSTRUCTION FOLLOWING (Vegetable Soup) | Explain how to prepare a simple vegetable soup that meets the following conditions: Use at least 3 different vegetables. The cooking time must not exceed 30 minutes. Include steps to make the soup both flavorful and healthy. Mention any kitchen tools needed. Provide alternatives if a vegetable is not available. Include tips to serve the soup nicely. | KTO | KTO adheres closely to the prompt with clear, healthy steps; BASE misinterprets and lacks clarity. |
Overall | KTO | KTO wins 9 categories vs. BASE’s 4, showing greater accuracy, clarity, and adherence to prompts. |
- Downloads last month
- 24
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support