Towards Visual Text Grounding of Multimodal Large Language Model • Paper 2504.04974 • Published Apr 7
LRM: Large Reconstruction Model for Single Image to 3D • Paper 2311.04400 • Published Nov 8, 2023