Yozh

justheuristic

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3, 2025 • 20

liked a model 4 days ago

Manmay/tortoise-tts

Updated Oct 25, 2023 • 20

upvoted a paper 11 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26, 2025 • 72

liked a model 5 months ago

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 3.25M • 4.32k

liked 2 models 6 months ago

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated about 1 hour ago • 555 • 1.17k

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Nov 7, 2025 • 60.8k • 2.3k

liked a dataset 7 months ago

yandex/mad-cars

Viewer • Updated Jun 29, 2025 • 5.88M • 152 • 32

upvoted an article 7 months ago

Learn the Hugging Face Kernel Hub in 5 Minutes

Article • Jun 12, 2025 • 151

upvoted a paper 7 months ago

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 66

upvoted 2 papers 8 months ago

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84

Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Paper • 2505.14669 • Published May 20, 2025 • 78

upvoted an article 8 months ago

4D masks support in Transformers

Article • Jan 8, 2024

upvoted 2 papers 9 months ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21, 2025 • 44

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published Apr 7, 2025 • 137

commented a paper 9 months ago

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published Apr 8, 2025 • 110

upvoted 5 papers 9 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8, 2025 • 99

An Empirical Study of GPT-4o Image Generation Capabilities

Paper • 2504.05979 • Published Apr 8, 2025 • 64

Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published Apr 8, 2025 • 85

HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference

Paper • 2504.05897 • Published Apr 8, 2025 • 21

Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence

Paper • 2503.20533 • Published Mar 26, 2025 • 12