NG's picture

132 255

NG

SirRa1zel

·

AI & ML interests

Text-to-Speech, Translation, Object Detection

Recent Activity

liked a model 4 days ago

nari-labs/Dia-1.6B

liked a Space 6 days ago

hkchengrex/MMAudio

liked a Space 14 days ago

MohamedRashad/Orpheus-TTS

View all activity

Organizations

None yet

SirRa1zel's activity

upvoted a collection 15 days ago

Orpheus Multilingual Research Release

Beta Release of multilingual models. • 12 items • Updated 16 days ago • 76

upvoted a paper 25 days ago

TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes

Paper • 2503.23461 • Published 27 days ago • 94

upvoted a paper about 1 month ago

Long-Video Audio Synthesis with Multi-Agent Collaboration

Paper • 2503.10719 • Published Mar 13 • 9

upvoted 3 papers about 2 months ago

TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding

Paper • 2502.19400 • Published Feb 26 • 49

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Paper • 2502.18364 • Published Feb 25 • 36

MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use

Paper • 2502.15872 • Published Feb 21 • 5

upvoted an article 3 months ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4

• 1.23k

upvoted a collection 3 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated 26 days ago • 448

upvoted 4 papers 3 months ago

FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

Paper • 2501.12909 • Published Jan 22 • 70

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 86

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution

Paper • 2501.10045 • Published Jan 17 • 9

SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces

Paper • 2501.09756 • Published Jan 16 • 19

upvoted a collection 3 months ago

OuteTTS 0.3

4 items • Updated 19 days ago • 18

upvoted a paper 3 months ago

MangaNinja: Line Art Colorization with Precise Reference Following

Paper • 2501.08332 • Published Jan 14 • 60

upvoted a collection 3 months ago

Visual Document Retrieval

A collection of models, datasets, and spaces in the VDR series • 5 items • Updated Jan 10 • 8

upvoted 3 papers 3 months ago

UnCommon Objects in 3D

Paper • 2501.07574 • Published Jan 13 • 13

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 66

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 46

upvoted a collection 4 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated 3 days ago • 283

upvoted a paper 5 months ago

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 49