Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

reacted to singhsidhukuldeep's post with 🧠 7 minutes ago
Exciting News in AI: JinaAI Releases JINA-CLIP-v2! The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

🚀 Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

⚡️ Under The Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

📊 Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs 1024D)

🎯 Key Innovation: The model solves the long-standing challenge of unifying text-only and multi-modal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems! Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
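
As a rough illustration of the Matryoshka-style dimension reduction mentioned in the post (256D vs 1024D), the sketch below keeps only the leading components of an embedding and re-normalizes them before computing cosine similarity. The vectors are made-up toy values, not output of JINA-CLIP-v2, and `truncate_embedding` is a hypothetical helper name rather than part of any library.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize (hypothetical helper)."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be in 1..len(vec)")
    return l2_normalize(vec[:dim])


def cosine(a, b):
    """Cosine similarity of two equal-length unit vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy 8-dimensional vectors standing in for real 1024D embeddings (assumption).
text_vec = [0.9, 0.1, 0.3, 0.2, 0.05, 0.01, 0.02, 0.03]
image_vec = [0.8, 0.2, 0.25, 0.3, 0.01, 0.04, 0.02, 0.01]

full = cosine(l2_normalize(text_vec), l2_normalize(image_vec))
# Keeping 2 of 8 dimensions mirrors the 75% reduction quoted above.
small = cosine(truncate_embedding(text_vec, 2), truncate_embedding(image_vec, 2))
```

Whether the truncated similarity stays close to the full one depends on the model having been trained with Matryoshka Representation Learning; arbitrary embeddings give no such guarantee.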


Posts 35

Post
📢 So far I have noticed that 🧠 reasoning with LLMs 🤖 in English tends to be more accurate than in other languages.
However, besides googletrans and other open, transparent translators, I could not find an easy-to-use solution that handles the following pain points:
1. 🔴 Third-party framework installation
2. 🔴 Text chunking
3. 🔴 Support for meta-annotation such as spans / objects / etc.

💎 To cope with the problem of IR from non-English texts, I am happy to share bulk-translate 0.25.0. 🎊

⭐ https://github.com/nicolay-r/bulk-translate

bulk-translate is a tiny, no-string Python 🐍 framework for translating a series of texts that contain pre-annotated fixed spans, which stay invariant under translation.

It provides a 👨‍💻 Python 🐍 API for quick data translation with (optionally) annotated objects in texts (see figure below).
I have made it as accessible as possible for downstream RAG and / or LLM-powered apps:
📘 https://github.com/nicolay-r/bulk-translate/wiki

All you have to do is provide an iterator of texts, where each text is either:
1. ✅ A string object
2. ✅ A list of strings and nested lists that represent spans (value + any ID data).

🤖 By default I provide a wrapper over googletrans, which you can override with your own 🔥
https://github.com/nicolay-r/bulk-translate/blob/master/models/googletrans_310a.py
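
The input format and the pluggable translator described above can be sketched roughly as follows. This is not the bulk-translate API: `translate_with_spans` and `fake_translate` are hypothetical names, and the span layout `[value, id]` is only one possible reading of "value + any ID data".

```python
def translate_with_spans(texts, translate_fn):
    """Translate plain-text parts and leave annotated spans untouched.

    Each text is either a string or a list that mixes strings with
    nested lists (spans) of the assumed form [value, any_id_data...].
    """
    for text in texts:
        if isinstance(text, str):
            yield translate_fn(text)
        else:
            # Strings are translated; nested lists (spans) pass through as-is.
            yield [translate_fn(p) if isinstance(p, str) else p for p in text]


def fake_translate(s):
    """Stand-in translator for the demo; a real one would call a translation service."""
    return s.upper()


texts = [
    "hello world",
    ["meeting with ", ["John", "PER-1"], " in ", ["London", "LOC-2"]],
]
result = list(translate_with_spans(texts, fake_translate))
```

Passing the translator as a plain callable keeps the sketch independent of any particular backend, which is the same idea as overriding the default googletrans wrapper with your own.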
Post
📢 If you work in the relation extraction / character network domain, this post may be relevant to you.
Excited to share the most recent milestone: the release of ARElight 0.25.0 🎊

Core library: https://github.com/nicolay-r/ARElight
Server: https://github.com/nicolay-r/ARElight-server

🔎 What is ARElight? It is a Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts.
In short, it extracts contexts that mention object pairs for subsequent prompting / classification.
In the slides below we illustrate the ARElight application for sentiment classification between object pairs in context.
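
The idea of extracting a context for each pair of mentioned objects can be sketched as below. This is a minimal illustration and not ARElight code: `extract_pair_contexts`, the `[SUBJECT]` / `[OBJECT]` masks, and the token-index representation of entities are all assumptions made for the example.

```python
from itertools import permutations


def extract_pair_contexts(tokens, entity_positions):
    """Yield (source, target, context) for every ordered pair of entities."""
    for src, tgt in permutations(entity_positions, 2):
        # Mask the two entities so a prompt or classifier sees their roles.
        context = " ".join(
            "[SUBJECT]" if i == src else "[OBJECT]" if i == tgt else tok
            for i, tok in enumerate(tokens)
        )
        yield tokens[src], tokens[tgt], context


tokens = ["Alice", "praised", "Bob", "yesterday"]
entity_positions = [0, 2]  # token indices of entities (would come from NER)
samples = list(extract_pair_contexts(tokens, entity_positions))
```

Each resulting sample can then be passed to a prompt template or a classifier that predicts the sentiment from the source entity towards the target entity.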

The demo uses DeepPavlov NER models + GoogleTranslate + a BERT-based classifier. The bash script for launching the quick demo illustrates the application of these components.

The new update provides a series of new features:
✅ SQLite support for storing all the extracted samples
✅ Support for the enhanced GUI for content investigation.
✅ Switch to external no-string projects for NER and Translator
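
For readers unfamiliar with SQLite-backed storage, here is a minimal sketch of persisting extracted samples with Python's standard `sqlite3` module. The table name, columns, and `store_samples` helper are hypothetical and do not reflect the actual ARElight schema.

```python
import sqlite3


def store_samples(conn, samples):
    """Create the table if needed and insert (source, target, context) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY, source TEXT, target TEXT, context TEXT)"
    )
    conn.executemany(
        "INSERT INTO samples (source, target, context) VALUES (?, ?, ?)", samples
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
store_samples(conn, [("Alice", "Bob", "[SUBJECT] praised [OBJECT] yesterday")])
rows = conn.execute("SELECT source, target, context FROM samples").fetchall()
```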

Supplementary materials:
📜 Paper: https://link.springer.com/chapter/10.1007/978-3-031-56069-9_23
