J C

dark-pen

AI & ML interests

None yet

Recent Activity

liked a model about 12 hours ago

Efficient-Large-Model/Sana_Sprint_0.6B_1024px_diffusers

liked a model about 12 hours ago

bghira/sana-1.6b-1024px-pseudo-camera-10k

Organizations

None yet

dark-pen's activity

upvoted a collection about 12 hours ago

SANA-Sprint

🏃SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated 26 days ago • 36

upvoted a collection 6 days ago

🪿 RWKV7

RWKV7 models • 13 items • Updated about 1 hour ago • 7

upvoted a collection 19 days ago

Sparse Retriever models

Independent implementation of various sparse retrievers. • 2 items • Updated May 20, 2024 • 2

upvoted 2 collections 20 days ago

Qwen2-Audio

Audio-language model series based on Qwen2 • 4 items • Updated 14 days ago • 57

MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated 12 days ago • 11

upvoted a collection about 1 month ago

Canary

A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 4 items • Updated 3 days ago • 21

upvoted an article about 1 month ago

Article

Hugging Face and Cloudflare Partner to Make Real-Time Speech and Video Seamless with FastRTC

Apr 9 • 26

upvoted a collection about 1 month ago

CoRNStack

State-of-the-art code retrieval and re-ranking models and datasets • 9 items • Updated Mar 26 • 17

upvoted 2 papers about 1 month ago

YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation

Paper • 2407.04822 • Published Jul 5, 2024 • 4

Free Process Rewards without Process Labels

Paper • 2412.01981 • Published Dec 2, 2024 • 35

upvoted 2 collections about 1 month ago

tevatron-v2-data

14 items • Updated about 5 hours ago • 3

Critique-out-Loud Reward Models

Paper: https://arxiv.org/abs/2408.11791 | Code: https://github.com/zankner/CLoud • 7 items • Updated Sep 5, 2024 • 4

upvoted 2 papers about 1 month ago

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

Paper • 2410.21271 • Published Oct 28, 2024 • 7

MixLLM: Dynamic Routing in Mixed Large Language Models

Paper • 2502.18482 • Published Feb 9 • 1

upvoted a collection about 1 month ago

L1

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • 2 items • Updated Mar 7 • 5

upvoted 2 papers about 1 month ago

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

Paper • 2503.04697 • Published Mar 6 • 3

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27 • 39

upvoted a collection about 1 month ago

Flux.1-dev ControlNets

A collection of ControlNet models for Flux.1-dev by Jasper Research • 4 items • Updated Sep 24, 2024 • 23

upvoted a paper about 2 months ago

ITVTON: Virtual Try-On Diffusion Transformer Model Based on Integrated Image and Text

Paper • 2501.16757 • Published Jan 28 • 2