MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data

  • 2025-08-14 17:58:45
  • Antoine Labatie, Michael Vaccaro, Nina Lardiere, Anatol Garioud, Nicolas Gonthier

Abstract

Self-supervised learning holds great promise for remote sensing, but standard self-supervised methods must be adapted to the unique characteristics of Earth observation data. We take a step in this direction by conducting a comprehensive benchmark of fusion strategies and reconstruction target normalization schemes for multimodal, multitemporal, and multispectral Earth observation data. Based on our findings, we propose MAESTRO, a novel adaptation of the Masked Autoencoder, featuring optimized fusion strategies and a tailored target normalization scheme that introduces a spectral prior as a self-supervisory signal. Evaluated on four Earth observation datasets, MAESTRO sets a new state-of-the-art on tasks that strongly rely on multitemporal dynamics, while remaining highly competitive on tasks dominated by a single mono-temporal modality. Code to reproduce all our experiments is available at https://github.com/ignf/maestro.

 
