P.M.SALMAN KHAN

salmankhanpm

AI & ML interests

None yet

Recent Activity


Organizations

Hacktoberfest 2023

salmankhanpm's activity

reacted to nicolay-r's post with 🔥 4 days ago
1558
📢 The 8B distilled version of DeepSeek-R1 based on LLaMA-3.1-8B is available alongside the one based on Qwen

📙 Notebook for using it to reason over a series of data 🧠:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_llama3.ipynb

Loading using the pipeline API of the transformers library:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_llama.py
🟑 GPU Usage: 12.3 GB (FP16/FP32 mode) which is suitable for T4. (a 1.5 GB less than Qwen-distilled version)
🐌 Perfomance: T4 instance: ~0.19 tokens/sec (FP32 mode) and (FP16 mode) ~0.22-0.30 tokens/sec. Is it should be that slow? πŸ€”
Model name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
⭐ Framework: https://github.com/nicolay-r/bulk-chain
🌌 Notebooks and models hub: https://github.com/nicolay-r/nlp-thirdgate
reacted to nicolay-r's post with 🔥 5 days ago
1282
🚨 MistralAI is back with the mistral small V3 model update and it is free! πŸ‘
https://docs.mistral.ai/getting-started/models/models_overview/#free-models

🚀 Below is the provider for reasoning over your dataset rows with a custom schema 🧠
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/mistralai_150.py

My personal usage experience and findings:
⚠️The original API usage may constanly fail with the connection.
To bypass this limitation, use --attempts [COUNT] to withstand connection loss while iterating through JSONL/CSV data (see πŸ“· below)

💵 It is actually: ~0.18 USD per 1M tokens
🌟 Framework: https://github.com/nicolay-r/bulk-chain
upvoted an article 6 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

• 626
upvoted an article 7 days ago

Red-Teaming Large Language Models

• 25
upvoted an article 20 days ago
updated a Space about 1 year ago