MedIT Solutions

company
Verified

AI & ML interests

None defined yet.

Recent Activity

meditsolutions's activity

mkurman posted an update 2 days ago
Just released NVAMP Loss!

✔️ A modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ A twist on standard cross-entropy that emphasizes outlier prediction errors and dynamically normalizes token-level variance (rough sketch below).
✔️ More stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss
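
For intuition, here is a rough PyTorch sketch of what a variance-normalized, outlier-emphasizing cross-entropy could look like. This is only my reading of the bullet points above, not the released NVAMP code; the function name, the one-standard-deviation outlier rule, and the outlier_weight parameter are illustrative assumptions, so refer to the repository for the actual implementation.

```python
import torch
import torch.nn.functional as F

def nvamp_like_loss(logits, targets, outlier_weight=2.0, eps=1e-6):
    """Sketch of a variance-normalized, outlier-emphasizing CE loss.

    NOT the released NVAMP implementation; the normalization and weighting
    rule below are illustrative guesses based on the post.
    logits: (batch, seq_len, vocab), targets: (batch, seq_len)
    """
    # Per-token cross-entropy, keeping the token dimension
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        reduction="none",
    ).view(targets.shape)

    # Dynamically normalize by the (detached) token-level standard deviation
    std = per_token.detach().std().clamp_min(eps)
    normalized = per_token / std

    # Emphasize outliers: tokens whose loss exceeds the mean by more than
    # one standard deviation receive extra weight
    mean = per_token.detach().mean()
    outlier_mask = (per_token.detach() > mean + std).float()
    weights = 1.0 + (outlier_weight - 1.0) * outlier_mask

    return (weights * normalized).mean()
```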
mkurman posted an update 3 days ago
mkurman posted an update 5 days ago
mkurman posted an update 6 days ago
mkurman posted an update 8 days ago
Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence.

It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy 🍓

https://github.com/MedITSolutionsKurman/medit-one
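
Purely as a toy illustration of the general idea (consuming one token per step while carrying an LSTM-like recurrent state), something like the sketch below is possible; the module name, layer choices, and sizes are my own assumptions and do not reflect the actual MedIT One architecture in the repository.

```python
import torch
import torch.nn as nn

class SingleTokenRecurrentLM(nn.Module):
    """Toy illustration only: one token per step with an LSTM-like state.

    This is NOT the MedIT One architecture; see the linked repository for
    the real implementation.
    """

    def __init__(self, vocab_size: int, d_model: int = 512):
        super().__init__()
        self.d_model = d_model
        self.embed = nn.Embedding(vocab_size, d_model)
        self.cell = nn.LSTMCell(d_model, d_model)  # recurrence instead of full attention
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # input_ids: (batch, seq_len); tokens are processed strictly one at a time
        batch, seq_len = input_ids.shape
        h = torch.zeros(batch, self.d_model, device=input_ids.device)
        c = torch.zeros_like(h)
        logits = []
        for t in range(seq_len):
            x = self.embed(input_ids[:, t])
            h, c = self.cell(x, (h, c))
            logits.append(self.head(self.norm(h)))
        return torch.stack(logits, dim=1)  # (batch, seq_len, vocab_size)
```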

mkurman posted an update 25 days ago
I've been working on something cool: a GRPO setup with an LLM evaluator that can also perform SFT on the feedback data, if you want. Check it out 😊

Any 🌟 are more than welcome 🤗

https://github.com/mkurman/grpo-llm-evaluator
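
As a generic sketch of the group-relative part of such a setup, the snippet below normalizes evaluator scores within each group of sampled completions to get GRPO-style advantages; the commented-out outer loop uses hypothetical placeholder names (policy.generate, llm_evaluator.score, grpo_update, sft_update), not the API of the linked repository.

```python
import torch

def group_relative_advantages(scores: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """GRPO-style advantages from LLM-evaluator scores.

    scores: (num_prompts, group_size) rewards the evaluator assigned to each
    sampled completion. Advantages are normalized within each group, so only
    relative quality matters. Generic sketch, not code from the linked repo.
    """
    mean = scores.mean(dim=-1, keepdim=True)
    std = scores.std(dim=-1, keepdim=True)
    return (scores - mean) / (std + eps)

# Hypothetical outer loop (placeholder names, not the repo's API):
# for prompts in dataloader:
#     completions = policy.generate(prompts, num_return_sequences=group_size)
#     scores = llm_evaluator.score(prompts, completions)
#     advantages = group_relative_advantages(scores)
#     grpo_update(policy, prompts, completions, advantages)
#     sft_update(policy, best_completions(completions, scores))  # optional SFT on feedback
```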
mkurman posted an update about 1 month ago
Blurred-Thoughts Supervised Fine-Tuning 🙈

After hours of working with GitHub Copilot to organize the code, I'm pleased to announce the release of Blurred-Thoughts Supervised Fine-Tuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.

BT-SFT introduces:
✅ A smart tokenization method that randomly masks tokens within <think> ... </think> tags, encouraging the model to generate diverse responses that align better with its own probability distribution instead of memorizing the thought process from distilled data (see the sketch below).
✅ A reward function that ensures responses are well-structured.

Explore and contribute to the project in my GitHub repository:
https://github.com/mkurman/blurred-thoughts-SFT

Keep me updated on your experiments with BT-SFT! 🐐
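
For illustration, here is a minimal sketch of the label-blurring idea, assuming Hugging Face-style causal LM label masking with -100 and that <think> and </think> each map to a single token id; the function name and blur_prob parameter are my own placeholders, and the actual BT-SFT tokenization lives in the repository above.

```python
import random
import torch

def blur_think_labels(input_ids: torch.Tensor, labels: torch.Tensor,
                      tokenizer, blur_prob: float = 0.5) -> torch.Tensor:
    """Randomly drop thought tokens from the loss ("blur" them).

    Tokens inside <think> ... </think> get their label set to -100 (the
    ignore index of Hugging Face-style causal LM losses) with probability
    blur_prob. Sketch only; the real BT-SFT tokenization may differ.
    """
    think_start = tokenizer.convert_tokens_to_ids("<think>")
    think_end = tokenizer.convert_tokens_to_ids("</think>")

    labels = labels.clone()
    inside = False
    for i, tok in enumerate(input_ids.tolist()):
        if tok == think_start:
            inside = True
        elif tok == think_end:
            inside = False
        elif inside and random.random() < blur_prob:
            labels[i] = -100  # this thought token no longer contributes to the loss
    return labels
```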
mkurman posted an update about 1 month ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring part of the CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model’s thoughts (CoT) can sometimes be inefficient—often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking, uncover only part of the CoT along with the expected answer, and blur the rest of the CoT. This approach allows the model to learn only a portion of the thought process while still arriving at the expected answer.

To see this in action, check out my experimental BT-SFT of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent people.
mkurman posted an update about 1 month ago
Ok, my 14B DeepSeek R1 merge with Qwen2.5 1M is really hot right now—it's got 2.6k downloads! It's sitting pretty as the top trending model on the third page. 🔥

Check it out if you haven't already!
mkurman/Qwen2.5-14B-DeepSeek-R1-1M
mkurman posted an update about 1 month ago
I’ve simplified things for the AI OS community!

Check out Qwen-2.5-14B-DeepSeek-R1-1M! It's a cool blend of the latest Qwen 2.5 with 14 billion parameters and a massive 1 million token context window, merged with the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M
mkurman posted an update about 2 months ago
mkurman posted an update 2 months ago
I kindly invite you to try my experimental Llama 3.2 3B with o1-like thinking.

It utilizes Thoughts when needed, so don't be surprised when it doesn't. It also has a minor bug that requires further fine-tuning (sometimes it starts with <|python_tag|> instead of <Thought>).

Enjoy!

Give some likes and whatever to make me feel better and motivated to keep going 😂

mkurman/llama-3.2-MEDIT-3B-o1