pllm-jt-ckpt

non-profit

pllm-jt-ckpt's activity

xcpan

authored a paper 14 days ago

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Paper • 2505.10046 • Published 15 days ago • 9

xcpan

authored 3 papers 15 days ago

PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop

Paper • 2503.09595 • Published Mar 12

Transfer between Modalities with MetaQueries

Paper • 2504.06256 • Published Apr 8

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published 15 days ago • 85

rilynhan

authored a paper 5 months ago

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published Dec 18, 2024 • 24

xcpan

authored a paper 11 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 61

xcpan

authored 4 papers over 1 year ago

Image Sculpting: Precise Object Editing with 3D Geometry Control

Paper • 2401.01702 • Published Jan 2, 2024 • 21

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

Paper • 2203.07996 • Published Feb 24, 2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

Paper • 2211.10950 • Published Nov 20, 2022

Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Paper • 2310.02992 • Published Oct 4, 2023 • 4