CoT has long been one of the hottest techniques in AI thanks to its effectiveness and its compelling core idea: encouraging models to solve complex problems through explicit intermediate reasoning steps. But researchers rarely stop at the original CoT approach; they keep finding modifications that further improve LLMs' reasoning. That's what we're going to talk about today.
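To make the core idea concrete, here's a minimal sketch of plain CoT prompting. The `complete()` function is a hypothetical stand-in for whatever LLM client you use, stubbed out so the example runs on its own:

```python
# Minimal sketch of Chain-of-Thought (CoT) prompting.
# `complete` is a hypothetical stand-in for an LLM client call;
# it is stubbed here so the example is self-contained.

def complete(prompt: str) -> str:
    """Stub for an LLM completion call; swap in a real client."""
    return f"[model output for prompt: {prompt!r}]"

question = "A store sells pens at 3 for $4. How much do 12 pens cost?"

# Direct prompting: ask for the answer outright.
direct_answer = complete(f"{question}\nAnswer:")

# CoT prompting: the classic zero-shot trigger that elicits explicit
# intermediate reasoning steps before the final answer.
cot_answer = complete(f"{question}\nLet's think step by step.")

# With a real model, the CoT variant typically produces reasoning like:
# "12 pens = 4 groups of 3; each group costs $4, so 4 * $4 = $16."
print(cot_answer)
```

The enhanced approaches below all build on this same skeleton, changing how the intermediate steps are elicited, structured, or verified.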
Here's a list of the 10 latest enhanced CoT approaches:
The capabilities of the new Qwen 3 models are fascinating, and I am watching that space!
My experience, however, is that context management is vastly more important with them. If you use a client that keeps a typical session log with rolling compression, a Qwen 3 model will start to generate the same messages over and over. I don't think that detracts from the models; they're optimized for a more advanced MCP environment. I honestly think the 8B is optimal for home use, given proper RAG/CAG.
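For readers unfamiliar with rolling compression, here's a minimal sketch of the idea, assuming a simple message list and a `summarize()` helper you'd supply yourself (both names are hypothetical, not from any particular client):

```python
# Rolling-compression sketch for a chat session log.
# Older turns are folded into a running summary so the context window
# holds only that summary plus the most recent messages verbatim.

from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "user", "assistant", or "system"
    content: str

def summarize(messages: list[Message]) -> str:
    """Hypothetical summarizer; in practice this is usually an LLM call."""
    return " / ".join(m.content[:40] for m in messages)

def compress_log(log: list[Message], keep_last: int = 6) -> list[Message]:
    """Keep the last `keep_last` turns verbatim; fold the rest into a summary."""
    if len(log) <= keep_last:
        return log
    older, recent = log[:-keep_last], log[-keep_last:]
    summary = Message("system", f"Summary of earlier conversation: {summarize(older)}")
    return [summary] + recent

if __name__ == "__main__":
    log = [Message("user", f"turn {i}") for i in range(10)]
    for m in compress_log(log):
        print(m.role, "|", m.content)
```

One plausible reading of the repetition issue is that the summary keeps reinjecting the same earlier content each turn, nudging the model to reproduce it.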
In typical session chats, Lamarck and Chocolatine are still my daily drivers. I worked hard to give Lamarck v0.7 a sprinkling of CoT from both DRT and DeepSeek R1. While those models have since been surpassed on the leaderboards, in practice I still really enjoy their output.
My projects are focused on application and context management, because that's where the payoff in improved quality is right now. But should a set of finetunes come along that would blend into just the right mix, my recipes are standing by.