TethysAI Vortex Reasoning
- Developed by: TethysAI
- License: apache-2.0
- Fine-tuned from: saishshinde15/TethysAI_Base_Reasoning
- Category: Experimental, Research
Introduction
TethysAI Vortex Reasoning is an experimental model that advances the structured reasoning capabilities pioneered by TethysAI Base Reasoning. While the Base Reasoning model used Group Relative Policy Optimization (GRPO) to enhance step-by-step logical thought, similar to DeepSeek-R1, this model takes a different approach: it eliminates GRPO entirely and instead relies on high-quality Supervised Fine-Tuning (SFT).
The core objective was to investigate whether deep reasoning and self-questioning behavior could emerge purely through SFT on high-quality datasets. The results were highly promising: the model successfully questions itself internally, improves reasoning depth, and consistently generates structured, logical responses.
Key Features
1️⃣ Advanced Reasoning Without GRPO
This model does not rely on GRPO, yet it achieves a similar self-reflective thought process, suggesting that structured reasoning can be induced through high-quality SFT alone.
2️⃣ Self-Questioning and Iterative Thinking
The model actively asks itself intermediate questions before answering, mimicking the deep reflection-based thought process of models like DeepSeek-R1. This leads to more reliable and well-structured responses.
3️⃣ High-Quality SFT on a Curated Dataset
To compensate for the lack of reinforcement learning, we used an extensive dataset tailored for deep reasoning. This dataset includes:
- Mathematical proofs & logical puzzles
- Complex multi-step problem-solving tasks
- Philosophical and ethical reasoning
- Scientific hypothesis evaluation
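As an illustration, an SFT training record targeting this reasoning format might be serialized as below. This is a hypothetical sketch: the field names (`prompt`, `reasoning`, `answer`) and the exact marker layout are assumptions for illustration, not the actual (unreleased) dataset schema.

```python
# A hypothetical SFT record in the reasoning format described above.
# Field names and marker layout are illustrative assumptions, not the
# actual training-data schema.
record = {
    "prompt": "If x + 3 = 10, what is x?",
    "reasoning": "Subtract 3 from both sides: x = 10 - 3 = 7.",
    "answer": "x = 7",
}

def to_training_text(rec: dict) -> str:
    """Serialize a record into a <think>/<answer> target string."""
    return (
        f"<think>{rec['reasoning']}</think>"
        f"<answer>{rec['answer']}</answer>"
    )

print(to_training_text(record))
```

Training targets in this shape give the model a consistent place to "show its work" before committing to a final answer.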
4️⃣ Implicit Use of `<think>` and `<answer>` Tokens
The model internally uses special reasoning markers (`<think>` and `<answer>`) to structure its responses, though these may not always be visible in the final output. This ensures a consistent and methodical approach to answering questions.
5️⃣ Part of the TethysAI Vortex Family
This model belongs to the TethysAI Vortex series, a collection of fine-tuned models pushing the boundaries of SFT-based reasoning without reinforcement learning.
Breakthrough Insights
| Feature | Base Reasoning (GRPO) | Vortex Reasoning (SFT-Only) |
|---|---|---|
| Structured Thought Process | ✅ Yes (GRPO) | ✅ Yes (SFT) |
| Self-Reflection & Questioning | ✅ Strong | ✅ Equally Strong |
| GRPO-Free Optimization | ❌ No | ✅ Achieved via SFT |
| Step-by-Step Problem Solving | ✅ Yes | ✅ Yes |
| Use of `<think>` and `<answer>` | ✅ Explicit | ✅ Implicit (Internal Use) |
Key Takeaway: This experiment suggests that reinforcement learning is not the only pathway to advanced reasoning capabilities. With the right dataset and SFT strategy, models can self-reflect and deduce answers in a structured, logical manner.
How to Use
Running with Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model & tokenizer
model_name = "saishshinde15/TethysAI_Vortex_Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")

# Prepare input prompt
messages = [
    {"role": "system", "content": "You are an advanced AI assistant. Provide answers in a clear, step-by-step manner."},
    {"role": "user", "content": "If x + 3 = 10, what is x?"},
]

# Apply chat template and tokenize
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

# Generate, then decode only the newly generated tokens (skip the echoed prompt)
outputs = model.generate(input_ids, max_new_tokens=512)
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
Model tree for saishshinde15/TethysAI_Vortex_Reasoning
- Base model: Qwen/Qwen2.5-3B