Acta Geodaetica et Cartographica Sinica ›› 2024, Vol. 53 ›› Issue (10): 1942-1954. doi: 10.11947/j.AGCS.2024.20240019

• Remote Sensing Large Models

Multi-modal remote sensing large foundation models: current research status and future prospect

Yongjun ZHANG¹, Yansheng LI¹, Bo DANG¹, Kang WU¹, Xin GUO², Jian WANG², Jingdong CHEN², Ming YANG²

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
  2. Ant Group, Hangzhou 310013, China
  • Received: 2024-01-12 Published: 2024-11-26
  • Contact: Yansheng LI E-mail: zhangyj@whu.edu.cn; yansheng.li@whu.edu.cn
  • About author: ZHANG Yongjun (born 1975), male, PhD, professor. His research interests are aerospace photogrammetry and intelligent interpretation of remote sensing imagery. E-mail: zhangyj@whu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42030102)

Abstract:

Steady improvements in Earth-observation capability have made abundant data accessible and enabled the emergence and development of remote sensing foundation models (RSFMs). Designing a separate deep network architecture and optimization method for each data and task type demands substantial development effort and prohibitively high computational resources. To address this, researchers in the remote sensing field have shifted their focus to RSFMs and proposed many purpose-designed unified models. Integrating extensive geographic knowledge is regarded as a key approach to improving the generalizability and interpretability of RSFMs. Although existing works have explored or incorporated geographic knowledge in the architecture design or pre-training methods of RSFMs, a comprehensive survey of geographic knowledge-guided RSFMs is still lacking. This paper therefore first summarizes and categorizes large-scale pre-training datasets for RSFMs and reviews research progress on RSFMs by category. It then introduces geographic knowledge-guided intelligent interpretation algorithms for remote sensing imagery, along with advances in mining and using geographic knowledge for RSFMs. Finally, it outlines several future research prospects that address the challenges persisting in this field, aiming to provide directions for future investigations into RSFMs.

Key words: pre-training dataset, remote sensing intelligent interpretation, remote sensing foundation models, geographic knowledge

CLC number: