JavisGPT-dev

community

Recent Activity

Jungang authored a paper 17 days ago

Vidu S1: A Real-Time Interactive Video Generation Model

Jungang authored a paper 2 months ago

CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

Jungang submitted a paper 2 months ago

CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

authored a paper 17 days ago

Vidu S1: A Real-Time Interactive Video Generation Model

Paper • 2607.03118 • Published 25 days ago • 141

authored a paper 2 months ago

CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

Paper • 2605.15597 • Published May 15 • 11

submitted a paper to Daily Papers 2 months ago

CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

Paper • 2605.15597 • Published May 15 • 11

authored a paper 3 months ago

Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization

Paper • 2604.11259 • Published Apr 13 • 12

submitted a paper to Daily Papers 3 months ago

Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization

Paper • 2604.11259 • Published Apr 13 • 12

authored 3 papers 4 months ago

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval

Paper • 2602.19961 • Published Feb 23 • 2

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Paper • 2603.17541 • Published Mar 18 • 20

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Paper • 2603.18429 • Published Mar 19 • 26

submitted 2 papers to Daily Papers 4 months ago

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Paper • 2603.18429 • Published Mar 19 • 26

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Paper • 2603.17541 • Published Mar 18 • 20

submitted a paper to Daily Papers 5 months ago

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Paper • 2602.19163 • Published Feb 22 • 14

authored a paper 5 months ago

BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

Paper • 2602.04163 • Published Feb 4 • 10

submitted a paper to Daily Papers 5 months ago

BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

Paper • 2602.04163 • Published Feb 4 • 10

authored a paper 6 months ago

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 50

submitted a paper to Daily Papers 6 months ago

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 50

authored a paper 7 months ago

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Paper • 2512.22905 • Published Dec 28, 2025 • 20

submitted a paper to Daily Papers 7 months ago

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Paper • 2512.22905 • Published Dec 28, 2025 • 20

authored a paper 10 months ago

Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction

Paper • 2510.03117 • Published Oct 3, 2025 • 12

authored 2 papers 10 months ago

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios

Paper • 2411.02708 • Published Nov 5, 2024 • 1

MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning

Paper • 2509.21113 • Published Sep 25, 2025 • 6