Article Should We Still Pretrain Encoders with Masked Language Modeling? By Nicolas-BZRD and 3 others • about 1 month ago • 21
Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks Paper • 2506.21182 • Published Jun 26 • 2
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Paper • 2505.21115 • Published May 27 • 137
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 78
Article MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before By isaacchung and 2 others • Apr 24 • 14
USER2 Collection Universal Sentence Encoder for Russian based on RuModernBERT, with support for context lengths up to 8,192 tokens and Matryoshka representation learning (see the usage sketch after this list) • 2 items • Updated Apr 18 • 4
Article Training and Finetuning Reranker Models with Sentence Transformers v4 By tomaarsen • Mar 26 • 150
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 73
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 233
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 95
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 146
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published Feb 25 • 28
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published Feb 25 • 67
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published Feb 20 • 91
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19 • 38
Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 199
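The USER2 collection entry above mentions Matryoshka representation learning. The following is a minimal sketch of how such an encoder could be loaded and its embeddings truncated with Sentence Transformers; the repo id and the 256-dimension cutoff are assumptions for illustration, not taken from the collection itself.

```python
# Minimal, hypothetical sketch (not from the USER2 collection's own docs):
# load a Matryoshka-capable sentence encoder and truncate its embeddings.
from sentence_transformers import SentenceTransformer

# "deepvk/USER2-base" is an assumed example repo id; substitute a model from
# the collection. truncate_dim keeps only the leading 256 dimensions, which
# Matryoshka representation learning trains to remain useful on their own.
model = SentenceTransformer("deepvk/USER2-base", truncate_dim=256)

sentences = ["Пример предложения.", "Ещё одно предложение."]  # Russian example inputs
embeddings = model.encode(sentences)
print(embeddings.shape)  # expected: (2, 256)
```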