AAAI 2026

January 24, 2026

Singapore, Singapore


The multi-modality remote sensing foundation model (MM-RSFM) has made notable progress recently. However, most existing approaches remain limited to medium-resolution, single-modality imagery, restricting their performance in fine-grained downstream applications such as disaster response and urban planning. In this work, MaRS is proposed, a multi-modality very-high-resolution (VHR) remote sensing foundation model designed for cross-modality, fine-grained interpretation of complex scenes. To support this, a multi-modality VHR SAR-optical dataset, MaRS-16M, is constructed through large-scale collection and semi-automated processing, comprising over 16 million paired samples. Unlike previous work, MaRS tackles two fundamental challenges in VHR SAR-optical self-supervised learning (SSL): cross-granularity contrastive learning (CGCL) is introduced to alleviate alignment inconsistencies caused by imaging differences between the two sensors, and meta-modality attention (MMA) is designed to unify heterogeneous physical characteristics across modalities. Compared to existing remote sensing foundation models (RSFMs) and general vision foundation models (VFMs), MaRS performs better as a pre-trained backbone across nine multi-modality VHR downstream tasks.
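The abstract does not give the exact form of the CGCL objective, but SAR-optical contrastive pre-training of this kind is commonly built on a symmetric InfoNCE loss over paired scene embeddings. The sketch below is a minimal, hypothetical illustration of that general idea (the function name, temperature value, and embedding shapes are assumptions, not the authors' method):

```python
import torch
import torch.nn.functional as F

def cross_modal_info_nce(sar_emb: torch.Tensor,
                         opt_emb: torch.Tensor,
                         temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss aligning paired SAR and optical embeddings.

    sar_emb, opt_emb: (batch, dim) tensors; row i of each comes from the
    same geographic scene, so diagonal pairs are the positives.
    """
    sar = F.normalize(sar_emb, dim=-1)
    opt = F.normalize(opt_emb, dim=-1)
    logits = sar @ opt.t() / temperature           # (batch, batch) similarities
    targets = torch.arange(sar.size(0))            # positives on the diagonal
    # Average the SAR->optical and optical->SAR directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings standing in for encoder outputs.
sar = torch.randn(8, 128)
opt = torch.randn(8, 128)
loss = cross_modal_info_nce(sar, opt)
```

In a cross-granularity setting, the key design question is which pairs count as positives when SAR and optical imagery resolve structures at different scales; the paper's CGCL presumably relaxes the strict one-to-one pairing assumed here.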

