Richrich
RichardForests
AI & ML interests
None yet
Recent Activity
upvoted a collection about 1 month ago: G1
upvoted an article 5 months ago: Open-source DeepResearch – Freeing our search agents
upvoted a paper 7 months ago: "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
Organizations
RL
- Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
  Paper • 2311.13231 • Published • 29
- Nash Learning from Human Feedback
  Paper • 2312.00886 • Published • 18
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 29
- MusicRL: Aligning Music Generation to Human Preferences
  Paper • 2402.04229 • Published • 17
3D/4D Gaussian Splatting
- HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting
  Paper • 2312.03461 • Published • 17
- COLMAP-Free 3D Gaussian Splatting
  Paper • 2312.07504 • Published • 15
- Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
  Paper • 2312.13763 • Published • 11
- AGG: Amortized Generative 3D Gaussians for Single Image to 3D
  Paper • 2401.04099 • Published • 9
Mamba
- havenhq/mamba-chat
  Updated • 40 • 99
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
  Paper • 2401.04081 • Published • 72
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 40
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 111
Transformers & MoE
- SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
  Paper • 2312.07987 • Published • 41
- Interfacing Foundation Models' Embeddings
  Paper • 2312.07532 • Published • 15
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 20
- TheBloke/quantum-v0.01-GPTQ
  Text Generation • 1B • Updated • 7 • 2
SSL
Gemma & MoE
Flash Attention in Triton
Parameter Efficient - LLMs
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 60
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 100
- Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
  Paper • 2404.07973 • Published • 33
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 122
LLM Agents OS
CV
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 53
- Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
  Paper • 2311.13231 • Published • 29
- GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors
  Paper • 2310.08529 • Published • 18
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior
  Paper • 2308.13223 • Published • 2
Diffusion models
- FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
  Paper • 2311.13073 • Published • 58
- MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture
  Paper • 2311.10123 • Published • 18
- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 15
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
  Paper • 2312.00845 • Published • 39
Multimodal
- Stable Video Diffusion 1.1
  Running on Zero • MCP • 1.93k
  📺 Generate a video from a single image
- Generative Multimodal Models are In-Context Learners
  Paper • 2312.13286 • Published • 37
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training
  Paper • 2401.00849 • Published • 17
- TheBloke/Sonya-7B-GPTQ
  Text Generation • 1B • Updated • 8 • 2
NeRF
- NeRFiller: Completing Scenes via Generative 3D Inpainting
  Paper • 2312.04560 • Published • 12
- SlimmeRF: Slimmable Radiance Fields
  Paper • 2312.10034 • Published • 9
- DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision
  Paper • 2312.16256 • Published • 18
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos
  Paper • 2401.05583 • Published • 11
(3D) Foundation Models
DL & Software DStructures
Dora
Lora variations
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 189
- Flora: Low-Rank Adapters Are Secretly Gradient Compressors
  Paper • 2402.03293 • Published • 6
- PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
  Paper • 2401.11316 • Published • 1
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
  Paper • 2405.12130 • Published • 51
Robotics - Cross Attention
DMs - Lighting Conditions
Language Models
- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 119
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 173k • 3.11k
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 53
- HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
  Paper • 2311.12454 • Published • 31