AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities
Abstract
Geospatial models must adapt to the diversity of Earth observation data in terms of resolutions, scales, and modalities. However, existing approaches expect fixed input configurations, which limits their practical applicability. We propose AnySat, a multimodal model based on the joint embedding predictive architecture (JEPA) and resolution-adaptive spatial encoders, allowing us to train a single model on highly heterogeneous data in a self-supervised manner. To demonstrate the advantages of this unified approach, we compile GeoPlex, a collection of 5 multimodal datasets with varying characteristics and 11 distinct sensors. We then train a single powerful model on these diverse datasets simultaneously. Once fine-tuned, we achieve better or near state-of-the-art results on the datasets of GeoPlex and 4 additional ones across 5 environmental monitoring tasks: land cover mapping, tree species identification, crop type classification, change detection, and flood segmentation. The code and models are available at https://github.com/gastruc/AnySat.
Community
Key Features:
Versatile Model: Handles diverse datasets spanning 3–11 channels, tiles ranging from 0.3 to 2600 hectares, and any combination of 11 sensors.
Simple to Use: Install and download AnySat with a single line of code, select your desired modalities and patch size, and immediately generate rich features.
Flexible Task Adaptation: Supports fine-tuning and linear probing for tasks like tile-wise classification and semantic segmentation.
Multi-dataset Training: Trains a single model across multiple datasets with varying characteristics.
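To make the "single line of code" workflow above concrete, here is a minimal sketch of loading AnySat and preparing an input. The `torch.hub` entrypoint name (`"gastruc/anysat"`, `"anysat"`), the input-dict keys, and the forward-call arguments (`patch_size`, `output`) are assumptions based on the project README, not verified here; check the repository for the exact API.

```python
import torch

# Example input: a Sentinel-2 time series for one tile.
# Assumed shape convention: (batch, time, channels, height, width),
# plus per-sample acquisition dates given as day of year.
data = {
    "s2": torch.randn(1, 12, 10, 6, 6),         # 12 dates, 10 bands, 6x6 pixels
    "s2_dates": torch.randint(0, 365, (1, 12)),  # acquisition days of year
}

# Hypothetical hub entrypoint and forward signature (see repo README):
# model = torch.hub.load("gastruc/anysat", "anysat", pretrained=True)
# features = model(data, patch_size=10, output="tile")  # one embedding per tile
```

Because the model accepts a dictionary keyed by modality, adding another sensor (e.g. Sentinel-1) should only require adding the corresponding tensors to `data` rather than changing the model.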
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications (2024)
- Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing (2024)
- MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps (2024)
- Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images (2024)
- Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation (2024)
- C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation (2024)
- From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing (2024)