---
base_model:
- saishshinde15/TethysAI_Base_Reasoning
tags:
- text-generation-inference
- transformers
- qwen2
- trl
- reasoning
- deepseekR1
- advanced-finetuning
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# TBH.AI Vortex Reasoning

- **Developed by:** TBH.AI
- **License:** apache-2.0
- **Fine-tuned from:** [saishshinde15/TBH.AI_Base_Reasoning](https://huggingface.co/saishshinde15/TBH.AI_Base_Reasoning)
- **Category:** Experimental, Research

## **Introduction**

TethysAI Vortex Reasoning is an **experimental model** that advances the structured reasoning capabilities pioneered by [TBH.AI Base Reasoning](https://huggingface.co/saishshinde15/TethysAI_Base_Reasoning). While the Base Reasoning model used **Group Relative Policy Optimization (GRPO)** to strengthen step-by-step logical thought, similar to **DeepSeek-R1**, this model takes a different approach: **it drops GRPO entirely and relies on high-quality Supervised Fine-Tuning (SFT) techniques**.

The core objective was to investigate whether **deep reasoning and self-questioning behavior can emerge purely through SFT on high-quality datasets**. The results are highly promising: the model **questions itself internally**, reasons in greater depth, and consistently generates structured, logical responses.

---

## **Key Features**

### **1️⃣ Advanced Reasoning Without GRPO**
This model **does not rely on GRPO**, yet it **achieves similar self-reflective thought processes**, showing that structured reasoning can be induced through **high-quality SFT alone**.

### **2️⃣ Self-Questioning and Iterative Thinking**
The model **actively asks itself intermediate questions before answering**, mimicking the deep **reflection-based thought process** of models like DeepSeek-R1. This leads to **more reliable** and **better-structured** responses.

### **3️⃣ High-Quality SFT on a Curated Dataset**
To compensate for the absence of reinforcement learning, we used an **extensive dataset** tailored for deep reasoning (an illustrative training sketch follows at the end of this section). This dataset includes:

- **Mathematical proofs & logical puzzles**
- **Complex multi-step problem-solving tasks**
- **Philosophical and ethical reasoning**
- **Scientific hypothesis evaluation**

### **4️⃣ Implicit Use of `<think>` and `</think>` Tokens**
The model internally uses **special reasoning markers** (`<think>` and `</think>`) to structure its responses, though these may not always be visible in the final output. This ensures a **consistent and methodical approach** to answering questions.

### **5️⃣ Part of the TethysAI Vortex Family**
This model belongs to the **TBH.AI Vortex series**, a collection of fine-tuned models pushing the boundaries of **SFT-based reasoning without reinforcement learning**.

---

## **Breakthrough Insights**

| Feature                          | Base Reasoning (GRPO)  | Vortex Reasoning (SFT-Only) |
|----------------------------------|------------------------|-----------------------------|
| Structured Thought Process       | ✅ Yes (GRPO)          | ✅ Yes (SFT)                |
| Self-Reflection & Questioning    | ✅ Strong              | ✅ Equally Strong           |
| GRPO-Free Optimization           | ❌ No                  | ✅ Achieved via SFT         |
| Step-by-Step Problem Solving     | ✅ Yes                 | ✅ Yes                      |
| Use of `<think>` and `</think>`  | ✅ Explicit            | ✅ Implicit (Internal Use)  |

**Key Takeaway:** This experiment confirms that **reinforcement learning is not the only pathway to advanced reasoning capabilities**. With the right dataset and SFT strategy, models can **self-reflect and logically deduce answers** in a structured manner.
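For readers curious how an SFT-only reasoning run might be set up, the sketch below uses TRL's `SFTTrainer`. It is illustrative only: the dataset name, chat formatting, and hyperparameters are placeholders, not the actual recipe used to train this model.

```python
# Minimal sketch of SFT-only fine-tuning with TRL's SFTTrainer.
# The dataset identifier and hyperparameters are hypothetical placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical chat-format reasoning dataset with a "messages" column
dataset = load_dataset("your-org/your-reasoning-sft-dataset", split="train")

training_args = SFTConfig(
    output_dir="vortex-sft",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=2,
)

trainer = SFTTrainer(
    model="saishshinde15/TethysAI_Base_Reasoning",  # base model from the card metadata
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```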
---

## **How to Use**

### **Running with Transformers**

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model & tokenizer
model_name = "saishshinde15/TBH.AI_Vortex_Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")

# Prepare input prompt
messages = [
    {"role": "system", "content": "You are an advanced AI assistant. Provide answers in a clear, step-by-step manner."},
    {"role": "user", "content": "If x + 3 = 10, what is x?"}
]

# Apply chat template and tokenize
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

# Generate response and decode only the newly generated tokens
outputs = model.generate(input_ids, max_new_tokens=512)
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
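Because the model may wrap its internal reasoning in `<think>`/`</think>` markers (see Key Features above), the trace can optionally be separated from the final answer. The snippet below is a minimal sketch; it assumes the markers appear verbatim in the decoded text and handles the case where they are absent.

```python
import re

# Split the answer from its (optional) internal reasoning trace.
# Assumes the reasoning, when present, is wrapped in <think> ... </think>.
match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
reasoning = match.group(1).strip() if match else None
final_answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()

print("Reasoning trace:", reasoning)
print("Final answer:", final_answer)
```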