An Agenda for Multimodal Foundation Models for Earth Observation
- ORNL
- University of Florida, Gainesville, FL
Archives of remote sensing (RS) data are increasing swiftly as new sensing modalities with enhanced spatiotemporal resolution become operational. While promising new breakthroughs, the sheer volume of RS archives stretches the limits of human analysts and existing AI tools, as most models are: i) limited to single data modalities; ii) task-specific; and iii) heavily reliant on labeled data. Emerging Foundation Models (FMs) have the potential to address these limitations. Trained on vast unlabeled datasets through self-supervised learning, FMs enable generic feature extraction that facilitates specialization to a wide variety of downstream tasks. This paper describes a vision towards an FM for multimodal Earth Observation (EO) data (FM4EO), discussing key building blocks and open challenges. We put particular emphasis on multimodal reasoning, a topic underexplored in EO. Our ultimate goal is a practical path toward FM4EO with the capacity to unlock breakthroughs in few-shot learning scenarios, multimodal geographic knowledge integration, synthesis, and hypothesis generation.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 2204580
- Resource Relation:
- Conference: International Geoscience and Remote Sensing Symposium 2023 (IGARSS) - Pasadena, California, United States of America - July 16-21, 2023
- Country of Publication:
- United States
- Language:
- English