测绘学报 (Acta Geodaetica et Cartographica Sinica) ›› 2025, Vol. 54 ›› Issue (9): 1633-1646. doi: 10.11947/j.AGCS.2025.20230306

• 摄影测量学与遥感 (Photogrammetry and Remote Sensing) •

多视影像深度学习密集匹配三维重建智能框架

季顺平1, 刘瑾1,2, 高建1, 龚健雅1

  1. 武汉大学遥感信息工程学院,湖北 武汉 430079
  2. 杭州电子科技大学通信工程学院,浙江 杭州 310018
  • 收稿日期:2023-12-31 修回日期:2025-08-07 出版日期:2025-10-10 发布日期:2025-10-10
  • 通讯作者: 刘瑾 E-mail:jishunping@whu.edu.cn;liujinwhu@whu.edu.cn
  • 作者简介:季顺平(1979—),男,博士,教授,主要研究方向为数字摄影测量、计算机视觉、遥感图像处理、深度学习等。E-mail:jishunping@whu.edu.cn
  • 基金资助:
    国家自然科学基金(42030102);国家自然科学基金(42171430)

An intelligent 3D reconstruction framework via deep learning based multi-view image matching

Shunping JI1, Jin LIU1,2, Jian GAO1, Jianya GONG1

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
  2. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
  • Received:2023-12-31 Revised:2025-08-07 Online:2025-10-10 Published:2025-10-10
  • Contact: Jin LIU E-mail:jishunping@whu.edu.cn;liujinwhu@whu.edu.cn
  • About author: JI Shunping (born 1979), male, PhD, professor. His research interests include digital photogrammetry, computer vision, remote sensing image processing, and deep learning. E-mail: jishunping@whu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42030102); The National Natural Science Foundation of China (42171430)

摘要:

基于高分辨率立体或多视影像的地表实景三维模型重建是摄影测量和计算机视觉领域的关键研究课题,其核心是影像密集匹配技术。目前,三维重建主流算法依然以人工经验设计的传统方法为主,基于深度学习的密集匹配算法虽然近年来表现突出,但尚未在三维重建工程中得到部署应用,国内外也缺乏基于深度学习或智能方法的实景三维重建框架或系统。为促进人工智能方法在大范围地表三维场景重建任务中的落地应用,本文提出了一个以深度学习密集匹配网络为核心的通用三维重建智能框架Deep3D,包括空中三角测量、最优视角选择、深度学习密集匹配、深度图融合、三维表面模型构建等完整处理流程,用于从多视遥感影像中重建城市级实景三维表面模型。该通用框架打通了空/天一体化、双目/多视/倾斜一体化作业。其中空天一体化通过将透视变换模型和有理多项式模型纳入统一深度学习网络中实现,双目多视一体化通过自适应多视角深度特征对齐与聚合实现。本文在两套倾斜航空影像上初步测试和比较了Deep3D、商业软件和开源解决方案,证实了Deep3D性能与非开源商业软件基本持平(或略优),远优于现有开源框架。本文还讨论了卫星多视影像三维重建中的效果。本文研究为深度学习方法的实景三维重建工程化落地应用提供了前瞻和重要参考。

关键词: 三维重建框架, 深度学习, 影像密集匹配, 遥感影像, 实景三维模型

Abstract:

Reconstructing real-scene 3D models of the ground surface from high-resolution stereo or multi-view images is a key research topic in photogrammetry and computer vision, with dense image matching as its core technology. At present, mainstream 3D reconstruction algorithms still rely on hand-crafted methods. Although deep learning-based dense matching algorithms have performed strongly in recent years, they have not yet been deployed in 3D reconstruction projects, and there are few reports, domestic or international, of real-scene 3D reconstruction frameworks or systems built on deep learning or other intelligent methods. To promote the application of artificial intelligence methods to large-scale 3D surface reconstruction, this article proposes Deep3D, a general intelligent framework for real-scene 3D reconstruction whose core component is a deep learning dense matching network. The framework covers the complete workflow of aerial triangulation, optimal view selection, deep learning-based dense matching, depth map fusion, and 3D surface model construction, and is designed to reconstruct city-scale real-scene 3D surface models from multi-view remote sensing images. It unifies the processing of aerial and satellite images by incorporating the perspective model and the rational polynomial coefficient (RPC) model into a single network, and unifies the processing of binocular, multi-view, and oblique images through adaptive multi-view deep feature alignment and aggregation. In preliminary tests on two sets of oblique aerial images, Deep3D is compared with commercial software and open-source solutions; the results confirm that Deep3D performs essentially on par with, or slightly better than, closed-source commercial software and far better than existing open-source frameworks. The article also discusses the performance of different methods on multi-view satellite images. This study provides an outlook and reference for the engineering application of deep learning methods in real-scene 3D reconstruction.

Key words: 3D reconstruction framework, deep learning, image dense matching, remote sensing images, real-scene 3D model
