Pramudito

Ditot

AI & ML interests

None yet

Recent Activity

replied to AdinaY's post 3 days ago

Let's check out the latest releases from the Chinese community in March! 👉 https://huggingface.co/collections/zh-ai-community/march-2025-releases-from-the-chinese-community-67c6b479ebb87abbdf8e2e76

✨MLLM
> R1 Omni by Alibaba Tongyi - 0.5B
> Qwen2.5 Omni by Alibaba Qwen - 7B with Apache2.0

🖼️Video
> CogView-4 by ZhipuAI - Apache2.0
> HunyuanVideo-I2V by TencentHunyuan
> Open Sora2.0 - 11B with Apache2.0
> Stepvideo TI2V by StepFun AI - 30B with MIT license

🎵Audio
> DiffRhythm - Apache2.0
> Spark TTS by SparkAudio - 0.5B

⚡️Image/3D
> Hunyuan3D 2mv/2mini (0.6B) by @TencentHunyuan
> FlexWorld by ByteDance - MIT license
> Qwen2.5-VL-32B-Instruct by Alibaba Qwen - Apache2.0
> Tripo SG (1.5B)/SF by VastAIResearch - MIT license
> InfiniteYou by ByteDance
> LHM by Alibaba AIGC team - Apache2.0
> Spatial LM by ManyCore

🧠Reasoning
> QwQ-32B by Alibaba Qwen - Apache2.0
> Skywork R1V - 38B with MIT license
> RWKV G1 by RWKV AI - 0.1B pure RNN reasoning model with Apache2.0
> Fin R1 by SUFE AIFLM Lab - financial reasoning

🔠LLM
> DeepSeek v3 0324 by DeepSeek - MIT license
> Babel by Alibaba DAMO - 9B/83B/25 languages

Organizations

None yet

Ditot's activity

reacted to tomaarsen's post with 🔥 3 days ago

Post

2098

‼️Sentence Transformers v4.0 is out! You can now train and finetune reranker models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think.

1️⃣ Reranker Training Refactor
Reranker models can now be trained using an extensive trainer with a lot of powerful features:
- MultiGPU Training (Data Parallelism (DP) and Distributed Data Parallelism (DDP))
- bf16 training support; loss logging
- Evaluation datasets + evaluation loss
- Improved callback support + an excellent Weights & Biases integration
- Gradient checkpointing, gradient accumulation
- Model card generation
- Resuming from a training checkpoint without performance loss
- Hyperparameter Optimization
and much more!

Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co/blog/train-reranker
Notably, the release is fully backwards compatible: all deprecations are soft, meaning that they still work but emit a warning informing you how to upgrade.

2️⃣ New Reranker Losses
- 11 new losses:
- 2 traditional losses: BinaryCrossEntropy and CrossEntropy
- 2 distillation losses: MSE and MarginMSE
- 2 in-batch negatives losses: MNRL (a.k.a. InfoNCE) and CMNRL
- 5 learning to rank losses: Lambda, p-ListMLE, ListNet, RankNet, ListMLE

3️⃣ New Reranker Documentation
- New Training Overview, Loss Overview, API Reference docs
- 5 new, 1 refactored training examples docs pages
- 13 new, 6 refactored training scripts
- Migration guides (2.x -> 3.x, 3.x -> 4.x)

4️⃣ Blogpost
Alongside the release, I've written a blogpost where I finetune ModernBERT on a generic question-answer dataset. My finetunes easily outperform all general-purpose reranker models, even models 4x as big. Finetuning on your domain is definitely worth it: https://huggingface.co/blog/train-reranker

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v4.0.1

replied to AdinaY's post 3 days ago

I'm going in!

reacted to Yehor's post with 😎 3 days ago

Post

1983

Are you interested in different runtimes for AI models?

Check out IREE (iree.dev): it converts models to MLIR and then executes them on different platforms.

I have tested it in Rust on CPU and CUDA: https://github.com/egorsmkv/eerie-yolo11

replied to AdinaY's post 3 days ago

Nice, and thank you!

reacted to AdinaY's post with 🔥 3 days ago

Post

1738

AReal-Boba 🔥 a fully open RL framework released by Ant Group, an affiliate company of Alibaba.
inclusionAI/areal-boba-67e9f3fa5aeb74b76dcf5f0a
✨ 7B/32B - Apache2.0
✨ Outperforms on math reasoning
✨ Replicates QwQ-32B with 200 data samples for under $200
✨ All-in-one: weights, datasets, code & tech report

1 reply

replied to hanzla's post 3 days ago

Great!

reacted to hanzla's post with 👍 3 days ago

Post

2856

Hi all,

Last week, I open-sourced Free Search API. It lets you source results from top search engines (including Google and Bing) for free. It uses SearXNG instances for this purpose.

I was overwhelmed by the community's response, and I am grateful for all the support and suggestions. So today, I have pushed several improvements that make this API more stable. These improvements include:

1) Parallel scraping of search results for faster responses
2) Markdown formatting of search results
3) Prioritizing SearXNG instances that have faster Google response times
4) Update/Get endpoints for SearXNG instances
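
As a rough illustration of the first three improvements, here is a minimal Python sketch. It is not taken from the Free Search codebase: the function names (`rank_instances`, `fetch_all`, `to_markdown`), the latency dictionary, and the result fields `title` and `url` are all assumptions made for the example, and the network call is passed in as a plain function so the sketch runs without any SearXNG instance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def rank_instances(latencies: Dict[str, float]) -> List[str]:
    """Order instance names fastest-first by measured response time (seconds).

    `latencies` is a hypothetical mapping of instance -> last observed latency.
    """
    return sorted(latencies, key=latencies.get)


def fetch_all(urls: List[str], fetch: Callable[[str], str], max_workers: int = 8) -> List[str]:
    """Fetch every URL in parallel threads; results keep the input order.

    `fetch` is supplied by the caller (for example an HTTP GET wrapper).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def to_markdown(results: List[Dict[str, str]]) -> str:
    """Render results as a Markdown bullet list of links.

    Assumes each result has hypothetical `title` and `url` keys.
    """
    return "\n".join(f"- [{r['title']}]({r['url']})" for r in results)
```

A caller could rank instances first, query the fastest one, then pass the result URLs to `fetch_all` with its own HTTP function and format the output with `to_markdown`.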

GitHub: https://github.com/HanzlaJavaid/Free-Search/tree/main

Try the deployed version: https://freesearch.replit.app/docs

I highly appreciate PRs, issues, stars, and any kind of feedback. Let's join hands, and make it real big!

4 replies