Acta Geodaetica et Cartographica Sinica ›› 2018, Vol. 47 ›› Issue (6): 864-872. doi: 10.11947/j.AGCS.2018.20170651

• Digital Photogrammetry and Deep Learning Methods •

Building Detection from Aerial Imagery Based on a U-Shaped Convolutional Neural Network

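WU Guangming1, CHEN Qi1,2, Ryosuke SHIBASAKI1, GUO Zhiling1, SHAO Xiaowei1, XU Yongwei1   

  1. Center for Spatial Information Science, University of Tokyo, Tokyo 113-8657, Japan;
    2. Faculty of Information Engineering, China University of Geosciences (Wuhan), Wuhan, Hubei 430074, China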
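  • Received: 2017-12-12 Revised: 2018-03-25 Online: 2018-06-20 Published: 2018-06-21
  • Corresponding author: SHAO Xiaowei E-mail: shaoxw@iis.u-tokyo.ac.jp
  • About the author: WU Guangming (1991-), male, PhD candidate; research interests: computer vision and human flow pattern analysis. E-mail: huster-wgm@csis.u-tokyo.ac.jp
  • Supported by:
    The GRENE-ei Program of the Japanese Ministry of Education, Culture, Sports, Science and Technology; The National Natural Science Foundation of China (41601506); The China Postdoctoral Science Foundation (2016M590730)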

High Precision Building Detection from Aerial Imagery Using a U-Net Like Convolutional Architecture

WU Guangming1, CHEN Qi1,2, Ryosuke SHIBASAKI1, GUO Zhiling1, SHAO Xiaowei1, XU Yongwei1   

  1. Center for Spatial Information Science, University of Tokyo, Tokyo 113-8657, Japan;
    2. Faculty of Information Engineering, China University of Geosciences, Wuhan 430074, China
  • Received: 2017-12-12 Revised: 2018-03-25 Online: 2018-06-20 Published: 2018-06-21
  • Supported by:
    The GRENE-ei (Green Network of Excellence, Environmental Information) Program; The National Natural Science Foundation of China (No. 41601506); The China Postdoctoral Science Foundation (No. 2016M590730)

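Abstract: In classic convolutional neural network architectures, resolution decreases steadily during forward propagation, so building edges are difficult to segment precisely when only the last-layer features are used, which in turn limits detection accuracy. To address this problem, a building detection method based on a U-shaped convolutional network is proposed. First, drawing on the modeling approach of U-Net, a neural network model that performs well in image segmentation, a symmetric network structure is used to fuse the high-dimensional and low-dimensional features of the deep network and recover high-fidelity boundaries. Second, because the model parameters at the top of the feature pyramid are comparatively under-optimized in the classic U-Net, predictions are output at two different scales, the top and bottom layers, to impose a twofold constraint, which further improves building detection accuracy. Experiments on an aerial imagery dataset covering 30 km² and containing more than 28 000 buildings show that the proposed method reaches mean IoU and Kappa values of 83.7% and 89.5%, respectively, outperforming the classic U-Net model and significantly outperforming the classic fully convolutional network model and an AdaBoost model based on hand-crafted features.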
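The twofold constraint described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, the scale factor, or the weighting, so everything below is an assumption made for illustration: binary cross-entropy as the per-scale loss, nearest-neighbour downsampling of the ground-truth mask, and the hypothetical names `binary_cross_entropy`, `downsample_mask`, and `twofold_loss`. It is not the authors' implementation.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2D grids."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def downsample_mask(mask, factor):
    """Nearest-neighbour downsampling: keep every `factor`-th pixel.

    Assumption: the source does not say how the coarse-scale label is built.
    """
    return [row[::factor] for row in mask[::factor]]


def twofold_loss(pred_full, pred_coarse, mask, factor,
                 w_full=1.0, w_coarse=1.0):
    """Weighted sum of the losses at the full and the coarse scale.

    pred_full   -- building probabilities at the input resolution
    pred_coarse -- building probabilities at 1/factor of that resolution
    mask        -- ground-truth building mask (0 or 1) at the input resolution
    The weights w_full and w_coarse are hypothetical parameters.
    """
    coarse_mask = downsample_mask(mask, factor)
    return (w_full * binary_cross_entropy(pred_full, mask)
            + w_coarse * binary_cross_entropy(pred_coarse, coarse_mask))


# Toy example: a 4x4 mask with a 2x2 building in the upper-right corner.
mask = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
pred_full = [[0.1, 0.1, 0.9, 0.8],
             [0.2, 0.1, 0.7, 0.9],
             [0.1, 0.1, 0.2, 0.1],
             [0.1, 0.1, 0.1, 0.1]]
pred_coarse = [[0.2, 0.8],
               [0.1, 0.1]]
loss = twofold_loss(pred_full, pred_coarse, mask, factor=2)
```

Because both terms share one ground truth, an error at the coarse scale raises the total loss even when the full-resolution output is correct, which is the sense in which the second output acts as an additional constraint.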

Key words: aerial imagery, building detection, convolutional neural network, U-shaped convolutional network, feature pyramid

Abstract: Automatically identifying buildings and precisely extracting their vector contours is an urgent yet highly challenging task. In recent years, convolutional neural networks (CNNs), which automatically extract highly complex, high-dimensional abstract features, have brought considerable progress in this area and have strongly improved the classification accuracy and generalization capability of state-of-the-art building detection methods. However, because the pooling layers in a classic CNN considerably reduce the spatial resolution of the input image, building detection results generated from the top layer alone often have coarse edges, which makes it difficult to extract accurate building contours. To tackle this problem, an improved fully convolutional network based on U-Net is proposed. First, the U-Net structure is adopted to recover accurate building edges through a bottom-up refinement process. Then, by predicting results at both the top and bottom layers of the feature pyramid, a twofold constraint strategy is proposed to further improve detection accuracy. Experiments on aerial imagery datasets covering 30 square kilometers and more than 28 000 buildings demonstrate that the proposed method performs well across different areas. The average IoU and Kappa values are 83.7% and 89.5%, respectively, which are higher than those of the classic U-Net model and significantly higher than those of the classic fully convolutional network model and the AdaBoost model trained with low-level features.
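For readers unfamiliar with the two evaluation metrics quoted above, the following sketch computes IoU and Cohen's Kappa for a pair of binary building masks using their standard definitions. The function names and the toy masks are illustrative assumptions; the abstract does not describe how the authors computed or averaged the metrics.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives over two binary 2D masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def iou(pred, truth):
    """Intersection over union of the building (positive) class."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0  # both masks empty: perfect match


def kappa(pred, truth):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Toy masks: 3 pixels agree as building, 1 false positive, 1 false negative.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
pred = [[1, 1, 1, 0],
        [1, 0, 0, 0]]
print(iou(pred, truth))    # 0.6
print(kappa(pred, truth))  # 0.5
```

IoU ignores true negatives, so it is sensitive to boundary errors on the buildings themselves, while Kappa accounts for the large background class by discounting agreement expected by chance.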

Key words: aerial imagery, building detection, convolutional neural network, U-Net, feature pyramid

CLC Number: