P.M. SALMAN KHAN (salmankhanpm)
0 followers · 16 following
https://salmankhanpm.co
AI & ML interests: None yet
Recent Activity
Reacted to nicolay-r's post · 3 days ago
The LLaMA-3.1-8B distilled version of DeepSeek-R1 is available alongside the Qwen-based one.
Notebook for using it in reasoning over series of data: https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_llama3.ipynb
Loading via the pipeline API of the transformers library: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_llama.py
GPU usage: 12.3 GB (FP16/FP32 mode), which is suitable for a T4 (1.5 GB less than the Qwen-distilled version).
Performance on a T4 instance: ~0.19 tokens/sec (FP32 mode) and ~0.22-0.30 tokens/sec (FP16 mode). Should it really be that slow?
Model: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
Framework: https://github.com/nicolay-r/bulk-chain
Notebooks and models hub: https://github.com/nicolay-r/nlp-thirdgate
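The post above loads the distilled model through the transformers pipeline API. Below is a minimal sketch of what such a loader can look like; it is illustrative, not the linked transformers_llama.py script, and the exact `pipeline` arguments are assumptions.

```python
# Hedged sketch: loading DeepSeek-R1-Distill-Llama-8B with the transformers
# pipeline API. Assumes `transformers` and `torch` are installed; the model
# weights are downloaded on first use.

MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"

def build_generator():
    # Lazy imports: torch and transformers are heavy optional dependencies.
    import torch
    from transformers import pipeline

    # float16 roughly halves memory versus float32, consistent with the
    # ~12.3 GB footprint on a T4 quoted in the post.
    return pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
    )

# Usage (triggers the model download):
#   gen = build_generator()
#   out = gen("Reason step by step: ...", max_new_tokens=128)
#   print(out[0]["generated_text"])
```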
Reacted to nicolay-r's post · 5 days ago
MistralAI is back with the Mistral Small v3 model update, and it is free! https://docs.mistral.ai/getting-started/models/models_overview/#free-models
Below is the provider for reasoning over your dataset rows with a custom schema: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/mistralai_150.py
My personal usage experience and findings:
The original API may constantly fail with connection errors. To work around this, use `--attempts [COUNT]` to withstand connection loss while iterating through JSONL/CSV data.
Price: ~0.18 USD per 1M tokens.
Framework: https://github.com/nicolay-r/bulk-chain
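The `--attempts [COUNT]` flag mentioned above amounts to retrying a failed per-row API call. A minimal sketch of that retry behavior, assuming a generic callable and not bulk-chain's actual implementation:

```python
import time

def with_attempts(fn, attempts=3, delay=1.0):
    """Call fn(); on exception, retry up to `attempts` times total,
    sleeping `delay` seconds between tries. Re-raise the last error
    if every attempt fails."""
    last_err = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as err:  # e.g. a ConnectionError from the API client
            last_err = err
            time.sleep(delay)
    raise last_err

# Usage: wrap each per-row model call while iterating JSONL/CSV data, e.g.
#   result = with_attempts(lambda: client.complete(row), attempts=5)
```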
Upvoted an article · 6 days ago
Open-R1: a fully open reproduction of DeepSeek-R1
Organizations
Spaces (1)
AutoTrain Advanced (runtime error)
Models
None public yet
Datasets
None public yet