Nicolay Rusnachenko's picture

Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (πŸ–Ό+πŸ“) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

upvoted a collection 1 day ago
DeepSeek-R1
View all activity

Organizations

None yet

Posts 44

view post
Post
168
πŸ“’ For those who wish to launch distilled DeepSeek R1 for reasoning with schema, sharing the Google Colab notebook:
πŸ“™ https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_colab.ipynb
This is a wrapper of the Qwen2 model hf provider via bulk-chain framework.
Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
GPU: T4 (15GB) is nearly enough in float32 mode.
πŸš€ To boost performance to load in bf16 (setup use_bf16=True)
🌟 Powered by bulk-chain: https://github.com/nicolay-r/bulk-chain
view post
Post
1082
πŸ“’ For those who wish to apply DeepSeek-R1 for handling tabular / streaming data using schema of prompts (CoT), the OpenRouter AI hosts API for accessing:
https://openrouter.ai/deepseek/deepseek-r1

The no-string option to quick start with using DeepSeek-R1 includes three steps:
βœ… OpenRouter provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/open_router.py
βœ… Bulk-chain for infering data: https://github.com/nicolay-r/bulk-chain
βœ… Json Schema for Chain-of-Though reasoning (see screenshot πŸ“· below)

πŸ“Ί below is a screenshot of how to quick start the demo, in which you can test your schema for LLM responses. It would ask to type all the parameters first for completing the requests (which is text within this example).

πŸ“ƒ To apply it for JSONL/CSV data, you can use --src shell parameter for passing the related file

⏳ As for time, OpenRouter finds me relatively slow with 30~40 seconds per request

Models:
deepseek-ai/DeepSeek-R1

datasets

None public yet