AI & ML interests

Health x AI driven by the community

Recent Activity

MaziyarPanahi
updated a Space 19 days ago
MaziyarPanahi
published a Space 19 days ago
mkurman
posted an update about 1 month ago
🚀 Big news! NeuroBLAST, my new architecture, has officially arrived on HF! After three intense months of training my 1.9-billion-parameter SLM on my trusty RTX 3090 Ti, I'm happy to announce the results. While it's not perfect just yet, I've dedicated countless hours to optimizing costs while crafting clever layer connections that mimic the brain's centers. Plus, I've introduced a new memory-like layer that's sure to turn heads! I can't wait to dive deep into this journey in my upcoming blog post. Stay tuned for the full scoop! 🔥

meditsolutions/NeuroBLAST-1.9B-Instruct-Early-Preview
MaziyarPanahi
posted an update about 1 month ago
🧬 Breaking news in Clinical AI: Introducing the OpenMed NER Model Discovery App on Hugging Face 🔬

OpenMed is back! 🔥 Finding the right biomedical NER model just became as precise as a PCR assay!

I'm thrilled to unveil my comprehensive OpenMed Named Entity Recognition Model Discovery App that puts 384 specialized biomedical AI models at your fingertips.

🎯 Why This Matters in Healthcare AI:
Traditional clinical text mining required hours of manual model evaluation. My Discovery App instantly connects researchers, clinicians, and data scientists with the exact NER models they need for their biomedical entity extraction tasks.

🔬 What You Can Discover:
✅ Pharmacological Models - Extract "chemical compounds", "drug interactions", and "pharmaceutical" entities from clinical notes
✅ Genomics & Proteomics - Identify "DNA sequences", "RNA transcripts", "gene variants", "protein complexes", and "cell lines"
✅ Pathology & Disease Detection - Recognize "pathological formations", "cancer types", and "disease entities" in medical literature
✅ Anatomical Recognition - Map "anatomical systems", "tissue types", "organ structures", and "cellular components"
✅ Clinical Entity Extraction - Detect "organism species", "amino acids", "protein families", and "multi-tissue structures"

💡 Advanced Features:
🔍 Intelligent Entity Search - Find models by specific biomedical entities (e.g., "Show me models detecting CHEM + DNA + Protein")
🏥 Domain-Specific Filtering - Browse by Oncology, Pharmacology, Genomics, Pathology, Hematology, and more
📊 Model Architecture Insights - Compare BERT, RoBERTa, and DeBERTa implementations
⚡ Real-Time Search - Auto-filtering as you type, no search buttons needed
🎨 Clinical-Grade UI - Beautiful, intuitive interface designed for medical professionals
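Under the hood, an entity search like "CHEM + DNA + Protein" boils down to set containment. A minimal sketch of that kind of filter, where the record layout and field names are hypothetical (not the app's actual schema):

```python
def filter_models(models, required_entities):
    """Keep only models whose supported entities cover the request.

    `models` is a list of dicts with hypothetical "name" and
    "entities" fields; the real app's data schema may differ.
    """
    required = {e.upper() for e in required_entities}
    return [m["name"] for m in models
            if required <= {e.upper() for e in m["entities"]}]

# Tiny hypothetical catalog for illustration
catalog = [
    {"name": "ner-chem-dna", "entities": ["CHEM", "DNA", "Protein"]},
    {"name": "ner-disease", "entities": ["Disease", "CHEM"]},
]
print(filter_models(catalog, ["CHEM", "DNA", "Protein"]))
```

Comparison is case-insensitive so that a query for "chem" still matches models tagged "CHEM".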

Ready to revolutionize your biomedical NLP pipeline?

🔗 Try it now: OpenMed/openmed-ner-models
🧬 Built with: Gradio, Transformers, Advanced Entity Mapping
mkurman
posted an update 6 months ago
Just released NVAMP Loss!

✔️ A modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ A twist on standard cross-entropy that emphasizes outlier prediction errors and dynamically normalizes token-level variance.
✔️ More stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss
mkurman
posted an update 6 months ago
Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence.

It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy 📝

https://github.com/MedITSolutionsKurman/medit-one

mkurman
posted an update 6 months ago
I've been working on something cool: a GRPO trainer with an LLM evaluator that can also perform SFT on the feedback data, if you want. Check it out 😊

Any 🌟 are more than welcome 🤗

https://github.com/mkurman/grpo-llm-evaluator
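For context, GRPO's core step is group-relative: several completions are sampled per prompt, scored (here, by an LLM evaluator), and each reward is standardized within its group, so no learned critic is needed. A minimal sketch of that step (not the repository's actual code):

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    # GRPO-style advantages: standardize rewards within a group so each
    # completion competes against its siblings from the same prompt.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# e.g. scores a hypothetical LLM evaluator assigned to 4 sampled completions
print(group_relative_advantages([0.9, 0.1, 0.5, 0.5]))
```

Advantages sum to zero within the group: above-average completions are reinforced, below-average ones are pushed down.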
mkurman
posted an update 7 months ago
Blurred-Thoughts Supervised-Finetuning 🙈

After hours of working with GitHub Copilot to organize the code, I'm pleased to announce the release of Blurred-Thoughts Supervised-Finetuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.

BT-SFT introduces:
✅ A smart tokenization method that randomly masks tokens within <think> ... </think> tags, encouraging the model to generate diverse responses that align better with its own probability distribution instead of memorizing the thought process from distilled data.
✅ A reward function that ensures responses are well-structured.
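The blurring step can be sketched roughly as follows; the -100 ignore index follows the common Hugging Face labels convention, and this is an illustration of the idea, not the repository's actual implementation:

```python
import random

THINK_OPEN, THINK_CLOSE, IGNORE = "<think>", "</think>", -100

def blur_thought_labels(tokens, labels, blur_p=0.5, seed=0):
    # Randomly replace labels inside <think>...</think> with IGNORE so the
    # loss skips them, leaving the model free to fill those steps from its
    # own distribution instead of memorizing the distilled thought process.
    rng = random.Random(seed)
    blurred, inside = [], False
    for tok, lab in zip(tokens, labels):
        if tok == THINK_OPEN:
            inside = True
            blurred.append(lab)   # keep the tag itself supervised
            continue
        if tok == THINK_CLOSE:
            inside = False
            blurred.append(lab)
            continue
        blurred.append(IGNORE if inside and rng.random() < blur_p else lab)
    return blurred
```

With blur_p=1.0 the whole thought span is hidden from the loss; with blur_p=0.0 this reduces to ordinary SFT.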

Explore and contribute to the project available in my GitHub repository:
https://github.com/mkurman/blurred-thoughts-SFT

Keep me updated on your experiments with BT-SFT! 🐍
mkurman
posted an update 7 months ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring a part of CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model's thoughts (CoT) can sometimes be inefficient, often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking and uncover only a part of CoT and the expected answer, blurring the other part of CoT. This approach allows the model to learn only a portion of the thought process while still arriving at an expected answer.

To demonstrate this, you can watch my experimental BT-SFT on meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent people.
mkurman
posted an update 7 months ago
Ok, my 14B DeepSeek R1 merge with Qwen2.5 1M is really hot right now: it's got 2.6k downloads! It's sitting pretty as the top trending model on the third page. 🔥

Check it out if you haven't already!
mkurman/Qwen2.5-14B-DeepSeek-R1-1M