gilangf3000
2 followers · 5 following
AI & ML interests
None yet
Recent Activity
liked a dataset about 20 hours ago: stallone/CommitPackFT
reacted to singhsidhukuldeep's post 7 days ago:
Remember Gemini, GPT-4o, all being true multimodal models. Now we have a paper describing an architecture that might achieve that! Uni-MoE: a native multimodal, Unified Mixture of Experts (MoE) architecture. Uni-MoE integrates various modalities (text, image, audio, video, speech) using modality-specific encoders and connectors for a cohesive multimodal understanding.

Training Strategy:
1. Training cross-modality alignment with diverse connectors.
2. Training modality-specific experts using cross-modality instruction data.
3. Tuning the Uni-MoE framework with Low-Rank Adaptation (LoRA) on mixed multimodal data.

Technical Details:
Modality-Specific Encoders: CLIP for images, Whisper for speech, BEATs for audio.
MoE-Based Blocks: Shared self-attention layers, feed-forward networks (FFN) based experts, and sparse routers for token-level expertise allocation.
Efficient Training: Utilizes LoRA for fine-tuning pre-trained experts and self-attention layers.

Uni-MoE outperforms traditional dense models on benchmarks like A-OKVQA, OK-VQA, VQAv2, MMBench, RACE-Audio, and English High School Listening Test.

The code is open-sourced as well: https://github.com/HITsz-TMG/UMOE-Scaling-Unified-Multimodal-LLMs/tree/master/Uni_MoE_v2
Paper: https://huggingface.co/papers/2405.11273
liked a model 7 days ago: fnlp/AnyGPT-base
Organizations
spaces
2
No application file
Llama
Api Llama
Sleeping
Cubelah
models
4
gilangf3000/llama-2-7b-miniguanaco • Updated 18 days ago
gilangf3000/wylaGPT-checkpoints • Updated Sep 28
gilangf3000/wylaGPT-v0.1-nebula-alpha-2b • Updated Sep 28 • 2 • 1
gilangf3000/gilang-ai • Updated Nov 12, 2023
datasets
None public yet