Muhtasham Oblokulov PRO

muhtasham

https://www.linkedin.com/in/muhtasham/

AI & ML interests

None yet

Recent Activity

liked a model about 12 hours ago

google/medsiglip-448

upvoted a collection about 14 hours ago

liked a model 3 days ago

ByteDance/LatentSync-1.6

upvoted a collection about 14 hours ago

NextCoder

NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 6 items • Updated 3 days ago • 61

upvoted a paper 4 days ago

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks

Paper • 2507.01955 • Published 10 days ago • 30

upvoted a collection 6 days ago

Speech-To-Text

https://kyutai.org/next/stt • 6 items • Updated 23 days ago • 12

upvoted a collection 8 days ago

EmoNet

The full collection of our EmoNet effort. More info available at: https://huggingface.co/blog/felfri/emonet • 8 items • Updated 20 days ago • 4

upvoted a paper 12 days ago

SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning

Paper • 2506.21355 • Published 16 days ago • 9

upvoted 2 collections 16 days ago

Gemma 3n

4 items • Updated 16 days ago • 10

Gemma 3n

4 items • Updated 2 days ago • 169

upvoted a paper 21 days ago

DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement

Paper • 2305.08227 • Published May 14, 2023 • 1

upvoted an article 24 days ago

Article

How to generate text: using different decoding methods for language generation with Transformers

By

•

Mar 1, 2020

• 222

upvoted 2 papers 28 days ago

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10 • 29

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 113

upvoted an article 28 days ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

By

and 8 others •

Jun 3

• 195

upvoted an article 29 days ago

Article

LTX-Video LoRA training study (Single image/style training)

By

•

Jan 14

• 3

upvoted an article 30 days ago

Article

Introduction to 3D Gaussian Splatting

By

•

Sep 18, 2023

• 91

upvoted a paper about 1 month ago

PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

Paper • 2506.05573 • Published Jun 5 • 71

upvoted 2 collections about 1 month ago

Qwen3

Chat templates replaced with Qwen2.5 template • 14 items • Updated 13 days ago • 2

SkyReels-AX

7 items • Updated Apr 13 • 6

upvoted 2 papers about 1 month ago

FlexPainter: Flexible and Multi-View Consistent Texture Generation

Paper • 2506.02620 • Published Jun 3 • 14

SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Paper • 2506.00830 • Published Jun 1 • 7

upvoted an article about 1 month ago

Article

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

By

and 3 others •

May 1, 2024

• 77