EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models Paper • 2506.01667 • Published 5 days ago • 21
VidText: Towards Comprehensive Evaluation for Video Text Understanding Paper • 2505.22810 • Published 10 days ago • 20