
Acta Geodaetica et Cartographica Sinica, 2025, Vol. 54, Issue (9): 1633-1646. doi: 10.11947/j.AGCS.2025.20230306
An intelligent 3D reconstruction framework via deep learning based multi-view image matching
Shunping JI, Jin LIU, Jian GAO, Jianya GONG
Received: 2023-12-31
Revised: 2025-08-07
Online: 2025-10-10
Published: 2025-10-10
Contact: Jin LIU
E-mail: jishunping@whu.edu.cn; liujinwhu@whu.edu.cn
About author: JI Shunping (born 1979), male, PhD, professor. His main research interests are digital photogrammetry, computer vision, remote sensing image processing, and deep learning. E-mail: jishunping@whu.edu.cn
Abstract:
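Reconstructing real-scene 3D surface models from high-resolution stereo or multi-view imagery is a key research topic in photogrammetry and computer vision, and dense image matching is its core technique. Mainstream 3D reconstruction algorithms still rely on traditional, hand-crafted methods. Although deep learning based dense matching has performed strongly in recent years, it has not yet been deployed in 3D reconstruction projects, and deep learning based or otherwise intelligent frameworks and systems for real-scene 3D reconstruction are lacking both in China and abroad. To promote the practical use of artificial intelligence in large-scale surface 3D scene reconstruction, this paper proposes Deep3D, a general intelligent 3D reconstruction framework built around a deep learning dense matching network. It covers the complete processing chain of aerial triangulation, optimal view selection, deep learning dense matching, depth map fusion, and 3D surface model construction, and reconstructs city-scale real-scene 3D surface models from multi-view remote sensing images. The framework unifies aerial and satellite processing as well as binocular, multi-view, and oblique processing. Aerial/satellite unification is achieved by incorporating the perspective transformation model and the rational polynomial model into a single deep learning network, and binocular/multi-view unification by adaptive alignment and aggregation of multi-view deep features. Deep3D, commercial software, and open-source solutions were preliminarily tested and compared on two sets of oblique aerial images. The results indicate that Deep3D performs on par with, or slightly better than, closed-source commercial software and far better than existing open-source frameworks. The paper also discusses results for 3D reconstruction from multi-view satellite imagery. This work provides a forward-looking and important reference for the engineering deployment of deep learning methods in real-scene 3D reconstruction.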
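The abstract says that aerial and satellite imagery are handled together by placing the perspective model and the rational polynomial (RPC) model inside one network. One way to picture that idea is a shared projection interface with two implementations. The following is only an illustrative sketch, not the authors' implementation: the class names, field names, and the `cubic_terms` helper are hypothetical, and the monomial order is assumed to follow the common RPC00B convention.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Sequence, Tuple


class CameraModel(Protocol):
    """Hypothetical common interface: object-space point -> image (col, row)."""

    def project(self, x: float, y: float, z: float) -> Tuple[float, float]: ...


@dataclass
class PinholeCamera:
    """Perspective model, typical for aerial frame cameras."""
    fx: float
    fy: float
    cx: float
    cy: float
    rotation: Sequence[Sequence[float]]  # 3x3 world-to-camera rotation
    translation: Sequence[float]         # world-to-camera translation

    def project(self, x: float, y: float, z: float) -> Tuple[float, float]:
        point = (x, y, z)
        # Camera-frame coordinates: R * X + t
        xc, yc, zc = (
            sum(r * v for r, v in zip(row, point)) + t
            for row, t in zip(self.rotation, self.translation)
        )
        return self.fx * xc / zc + self.cx, self.fy * yc / zc + self.cy


def cubic_terms(P: float, L: float, H: float) -> Tuple[float, ...]:
    """20 cubic monomials, in the RPC00B coefficient order (an assumption here)."""
    return (1.0, L, P, H, L * P, L * H, P * H, L * L, P * P, H * H,
            P * L * H, L ** 3, L * P * P, L * H * H, L * L * P,
            P ** 3, P * H * H, L * L * H, P * P * H, H ** 3)


@dataclass
class RpcCamera:
    """Rational polynomial model, typical for satellite pushbroom sensors."""
    row_num: Sequence[float]
    row_den: Sequence[float]
    col_num: Sequence[float]
    col_den: Sequence[float]
    offset: Dict[str, float]  # keys: lon, lat, h, row, col
    scale: Dict[str, float]   # same keys as offset

    def project(self, x: float, y: float, z: float) -> Tuple[float, float]:
        # x, y, z are longitude, latitude and height; normalise them first.
        L = (x - self.offset["lon"]) / self.scale["lon"]
        P = (y - self.offset["lat"]) / self.scale["lat"]
        H = (z - self.offset["h"]) / self.scale["h"]
        terms = cubic_terms(P, L, H)

        def poly(coeffs: Sequence[float]) -> float:
            return sum(c * t for c, t in zip(coeffs, terms))

        row_n = poly(self.row_num) / poly(self.row_den)
        col_n = poly(self.col_num) / poly(self.col_den)
        col = col_n * self.scale["col"] + self.offset["col"]
        row = row_n * self.scale["row"] + self.offset["row"]
        return col, row
```

Each camera projects points expressed in its own object-space coordinates (Cartesian for the pinhole case, geodetic for the RPC case). Code that warps features between views can then call `project` without branching on the sensor type.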
Shunping JI, Jin LIU, Jian GAO, Jianya GONG. An intelligent 3D reconstruction framework via deep learning based multi-view image matching[J]. Acta Geodaetica et Cartographica Sinica, 2025, 54(9): 1633-1646.
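Table 2
Comparison of reconstruction accuracy of six solutions on the WHU-OMVS test area
| Software/solution | PAG (0.2 m) / % ↑ | PAG (0.4 m) / % ↑ | PAG (0.6 m) / % ↑ | MAE / m (T = 20 m) ↓ | RMSE / m (T = 20 m) ↓ | Runtime / min ↓ |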
|---|---|---|---|---|---|---|
| ContextCapture | 83.35 | 95.28 | 97.18 | 0.190 | 0.916 | 128 |
| Metashape | 88.34 | 95.41 | 97.21 | 0.170 | 0.972 | 193 |
| SURE-Aerial | 60.68 | 82.20 | 90.49 | 0.324 | 1.049 | 134 |
| COLMAP | 80.33 | 92.39 | 95.67 | 0.236 | 1.193 | 412 |
| OpenMVS | 83.26 | 94.09 | 96.53 | 0.202 | 0.998 | 299 |
| Deep3D | 86.75 | 95.47 | 97.60 | 0.166 | 0.803 | 187 |
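Table 3
Comparison of reconstruction accuracy of six solutions on the Tianjin test area
| Software/solution | PAG (0.2 m) / % ↑ | PAG (0.4 m) / % ↑ | PAG (0.6 m) / % ↑ | MAE / m (T = 20 m) ↓ | RMSE / m (T = 20 m) ↓ | Runtime / min ↓ |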
|---|---|---|---|---|---|---|
| ContextCapture | 55.49 | 75.28 | 80.69 | 0.822 | 2.189 | 65.2 |
| Metashape | 63.27 | 76.42 | 80.55 | 0.811 | 2.153 | 133.2 |
| SURE-Aerial | 54.87 | 71.98 | 78.65 | 0.841 | 2.177 | 75.0 |
| COLMAP | 65.29 | 76.07 | 80.41 | 0.812 | 2.205 | 300.5 |
| OpenMVS | 63.57 | 75.38 | 80.22 | 0.830 | 2.243 | 204.5 |
| Deep3D | 66.98 | 76.83 | 81.01 | 0.818 | 2.250 | 100.9 |
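The accuracy columns in Tables 2 and 3 can be illustrated with a small sketch. The metric definitions are not stated in this section, so the code below assumes that PAG at a threshold is the percentage of evaluated grid cells whose absolute height error is below that threshold, and that MAE and RMSE are computed only over cells whose error is below T = 20 m. The function name `dsm_accuracy` and its parameters are hypothetical.

```python
import math


def dsm_accuracy(pred, truth, pag_thresholds=(0.2, 0.4, 0.6), outlier_t=20.0):
    """Summarise height errors between a reconstructed and a reference DSM.

    pred, truth: equal-length sequences of heights in metres;
    None marks a cell without a height value.
    """
    # Absolute error for every cell that has both heights.
    errors = [abs(p - t) for p, t in zip(pred, truth)
              if p is not None and t is not None]
    if not errors:
        raise ValueError("no cells with both a predicted and a reference height")

    # PAG (assumed definition): share of evaluated cells below each threshold.
    pag = {tau: 100.0 * sum(e < tau for e in errors) / len(errors)
           for tau in pag_thresholds}

    # MAE / RMSE (assumed definition): errors of outlier_t or more are excluded.
    kept = [e for e in errors if e < outlier_t]
    if kept:
        mae = sum(kept) / len(kept)
        rmse = math.sqrt(sum(e * e for e in kept) / len(kept))
    else:
        mae = rmse = float("nan")
    return {"pag": pag, "mae": mae, "rmse": rmse}


# Example call on five toy cells (the last one has no predicted height).
result = dsm_accuracy([10.1, 10.5, 12.0, 50.0, None],
                      [10.0, 10.0, 11.5, 10.0, 10.0])
print(result)
```

Under these assumptions a single gross blunder (the 40 m error in the toy data) lowers PAG but does not enter MAE or RMSE. This would explain how a method can lead on PAG while trailing slightly on RMSE.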