Shahrukh Khan

shahrukhx01

https://github.com/shahrukhx01

AI & ML interests

NLP

Recent Activity

updated a model 3 days ago

shahrukhx01/gradient-whisperer

liked a model 3 days ago

ResembleAI/chatterbox

shahrukhx01's activity

upvoted a collection 9 days ago

Falcon-H1

Falcon-H1 Family of Hybrid-Head Language Models, including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained and instruction-tuned). • 37 items • Updated 11 days ago • 36

upvoted an article 12 days ago

Article

🥬 LettuceDetect Goes Multilingual: Fine-tuning EuroBERT on Synthetic Translations

13 days ago • 9

upvoted an article 27 days ago

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 280

upvoted 2 collections about 1 month ago

Deepseek Papers

Deepseek papers collection • 24 items • Updated 3 days ago • 248

Gemma 3 QAT

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated 2 days ago • 194

upvoted 4 collections about 2 months ago

Orpheus Multilingual Research Release

Beta Release of multilingual models. • 12 items • Updated Apr 10 • 85

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated Apr 12 • 65

Llama 4

Llama 4 release • 13 items • Updated Apr 29 • 518

Nomic Embed Multimodal

Multimodal models allowing you to search over interleaved text, PDFs, charts, and images! • 15 items • Updated Apr 7 • 20

upvoted a collection 2 months ago

Orpheus TTS

TTS Towards Human-Sounding Speech • 2 items • Updated Mar 18 • 64

upvoted 7 collections 3 months ago

Zonos-v0.1

3 items • Updated Feb 12 • 28

Ultravox v0.5

Ultravox is a multimodal Speech LLM built around different pretrained LLMs (frozen) and the whisper-large-v3-turbo (fine-tuned) backbone. • 3 items • Updated Feb 10 • 15

reranking series v2

V2 crispy rerank series • 2 items • Updated Mar 13 • 23

BD3-LMs

https://m-arriola.com/bd3lms/ • 4 items • Updated Apr 12 • 23

Gemma 3 Release

24 items • Updated 2 days ago • 377

Hallucination detection

ModernBERT (base and large) trained to detect hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated 14 days ago • 17

GemmaX2

GemmaX2 language models, including pretrained and instruction-tuned models in two sizes: 2B and 9B. • 7 items • Updated Feb 7 • 22

upvoted 2 collections 4 months ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Apr 28 • 119

DeepSeek-R1

10 items • Updated 3 days ago • 686

upvoted a collection 5 months ago

KaLM-embedding

11 items • Updated Mar 11 • 24