To Meta AI Research: I would like to fold ylacombe/expresso into the training mix of an Apache-licensed TTS model series. Could you relax the Expresso dataset license to CC-BY or something more permissive?
Barring that, could I have an individual exception to train on the materials and distribute the trained Apache-licensed models, without redistributing the original files directly? Thanks!
📢 Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!
🤗 EMOVA is a novel end-to-end omni-modal LLM that can see, hear, and speak. Given omni-modal (i.e., textual, visual, and speech) inputs, EMOVA generates both textual and speech responses with vivid emotional control via its speech decoder and style controller.
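For readers wondering what driving such a checkpoint looks like in practice, here is a minimal sketch via 🤗 Transformers. The repo id `Emova-ollm/emova-7b`, the processor field names, and the in-prompt style request are assumptions for illustration, not the confirmed EMOVA interface; see the released inference code for the real API.

```python
# Hedged sketch of omni-modal inference with a trust_remote_code checkpoint.
# Repo id and processor/generate argument names are assumptions, not the
# verified EMOVA API -- consult the official release for the real interface.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

REPO = "Emova-ollm/emova-7b"  # hypothetical Hub repo id

processor = AutoProcessor.from_pretrained(REPO, trust_remote_code=True)
model = AutoModel.from_pretrained(
    REPO, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval()

# Omni-modal input: text instruction + image; the emotional style is
# requested in the prompt here, since the style-controller hook is
# model-specific and not something I can confirm from the announcement.
image = Image.open("example.jpg")
inputs = processor(
    text="Describe this image, and answer out loud in a cheerful tone.",
    images=image,
    return_tensors="pt",
)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```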
✨ EMOVA Highlights
✅ State-of-the-art omni-modality: EMOVA achieves results comparable to the state of the art on both vision-language and speech benchmarks simultaneously.
✅ Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3); see the device-selection sketch below.
✅ Modular design: we integrate multiple implementations of the vision encoder, vision projector, and language model, even including the most recent DeepSeekMoE-tiny!
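On the device-adaptation point, dual NVIDIA/Ascend support typically reduces to a small dispatch like the one below. This is my own sketch, not EMOVA's actual code, and it assumes Huawei's `torch_npu` plugin exposes the conventional `torch.npu` namespace once imported:

```python
import torch

def pick_device() -> torch.device:
    """Prefer NVIDIA CUDA, then Ascend NPU, then CPU."""
    if torch.cuda.is_available():  # NVIDIA GPUs (e.g., A800, H20)
        return torch.device("cuda")
    try:
        import torch_npu  # noqa: F401 -- Huawei's Ascend plugin for PyTorch
        if torch.npu.is_available():  # Ascend NPUs (e.g., 910B3)
            return torch.device("npu")
    except ImportError:
        pass
    return torch.device("cpu")

print(pick_device())
```

With a dispatch like this, the rest of the training/inference code can stay backend-agnostic: models and tensors are simply moved with `.to(pick_device())`.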