Acta Geodaetica et Cartographica Sinica ›› 2024, Vol. 53 ›› Issue (10): 1942-1954. doi: 10.11947/j.AGCS.2024.20240019

• Remote Sensing Large Model

Multi-modal remote sensing large foundation models: current research status and future prospect

Yongjun ZHANG1, Yansheng LI1, Bo DANG1, Kang WU1, Xin GUO2, Jian WANG2, Jingdong CHEN2, Ming YANG2

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
  2. Ant Group, Hangzhou 310013, China
  • Received: 2024-01-12 Published: 2024-11-26
  • Contact: Yansheng LI E-mail: zhangyj@whu.edu.cn; yansheng.li@whu.edu.cn
  • About author: ZHANG Yongjun (born 1975), male, PhD, professor. His research interests include aerospace photogrammetry and intelligent interpretation of remote sensing imagery. E-mail: zhangyj@whu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42030102)

Abstract:

Growing Earth-observation capabilities have eased access to abundant remote sensing data and enabled the emergence and development of remote sensing foundation models (RSFMs). Designing and optimizing a separate deep neural network for each data type and task requires substantial development effort and prohibitively expensive computation. To address these issues, researchers in the remote sensing field have shifted their focus to RSFMs and proposed many purpose-built unified models. Integrating extensive geographic knowledge has been recognized as a key approach to enhancing the generalizability and interpretability of RSFMs. Although existing works have explored or incorporated geographic knowledge in the architecture design or pre-training methods of RSFMs, a comprehensive survey of the current status of geographic knowledge-guided RSFMs is still lacking. This paper therefore first summarizes and categorizes large-scale pre-training datasets and then gives an overview of research progress in this field. It then introduces geographic knowledge-guided intelligent interpretation algorithms for remote sensing imagery, along with advances in exploring and exploiting geographic knowledge specifically for RSFMs. Finally, several future research directions are outlined to tackle the persisting challenges in this field and to inform future investigations into RSFMs.

Key words: pre-training dataset, remote sensing intelligent interpretation, remote sensing foundation models, geographic knowledge
