Acta Geodaetica et Cartographica Sinica, 2024, Vol. 53, Issue (10): 1955-1966. doi: 10.11947/j.AGCS.2024.20240068

• Remote Sensing Large Models •

Large models enabling intelligent photogrammetry: status, challenges and prospects

Mi WANG1, Xu CHENG1, Jun PAN1, Yingdong PI1, Jing XIAO2

  1. State Key Laboratory of Surveying, Mapping and Remote Sensing Information Engineering, Wuhan University, Wuhan 430079, China
  2. School of Computer Science, Wuhan University, Wuhan 430072, China
  • Received: 2024-02-18 Published: 2024-11-26
  • Contact: Xu CHENG E-mail: wangmi@whu.edu.cn; xucheng@whu.edu.cn
  • About author: WANG Mi (b. 1974), male, PhD, professor, PhD supervisor. His main research interest is high-precision intelligent satellite remote sensing technology. E-mail: wangmi@whu.edu.cn
  • Supported by:
    The National Key Research and Development Program of China (2022YFB3902804); The National Science Fund for Distinguished Young Scholars (62425102)

Abstract:

Large models have grown out of deep learning and transfer learning. By combining massive training datasets with very large parameter counts, they produce scale effects that give rise to emergent capabilities, and they show strong generalization and adaptability across many downstream tasks. Large models such as ChatGPT and SAM mark the arrival of the era of artificial general intelligence and offer new theories and techniques for automated, intelligent geospatial information processing. To further explore how large models can empower the broader field of photogrammetry, this paper first reviews the basic problems and task definitions of photogrammetry, summarizes the achievements of deep learning methods in intelligent photogrammetric processing, and analyzes the advantages and limitations of task-specific supervised pre-training. It then describes the characteristics and research progress of general-purpose large AI models, focusing on their scene generalization in basic vision tasks and their potential for three-dimensional representation. Finally, it discusses the current challenges and development trends of large model technology in photogrammetry from three perspectives: training data, model fine-tuning strategies, and the fusion and processing of heterogeneous multi-modal data.

Key words: large models, intelligent photogrammetry, deep learning, multi-modal

CLC number: