Multimodal - a RichardForests Collection

updated Feb 24, 2024

Stable Video Diffusion 1.1

Space • Running on Zero • MCP • 1.91k
Generate a video from a single image

Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 37

COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17

TheBloke/Sonya-7B-GPTQ

Text Generation • 1B • Updated Dec 31, 2023 • 21 • 2

TextDiffuser 2

Space • Sleeping • 140
Generate images from text prompts with layout planning

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Paper • 2402.12226 • Published Feb 19, 2024 • 45