Acta Geodaetica et Cartographica Sinica, 2022, Vol. 51, Issue (3): 457-467. doi: 10.11947/j.AGCS.2022.20200601

• Photogrammetry and Remote Sensing •

E-Unet: an atrous convolution-based neural network for building extraction from high-resolution remote sensing images

HE Zhimeng, DING Haiyong, AN Bingqi

  1. School of Remote Sensing & Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China
  • Received: 2020-12-15 Revised: 2021-08-04 Published: 2022-03-30
  • About the author: HE Zhimeng (1998-), female, master's student; main research interest: object extraction from remote sensing images. E-mail: 20191235004@nuist.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (No. 41571350)

Abstract: Extracting buildings from high-resolution remote sensing images is an active research topic, but because buildings vary in color, shape and size and contain many fine details, extraction results commonly suffer from blurred edges, rounded corners and lost details. This study proposes E-Unet, a deep learning network based on atrous convolution. In the network design, skip connections are introduced to reduce the loss of edge and corner detail; a newly designed convolution module enlarges the receptive field while reducing the number of parameters; and a Dropout module is added at the bottom layer of the network to avoid overfitting. Before being fed to the network, the images undergo histogram equalization, Gaussian bilateral filtering and inter-band ratio operations, and the results are combined into a multi-band tensor as the model input (without conversion to grey-scale images). To validate the network and to identify the sources of the performance improvement, two sets of experiments were designed on the Massachusetts and WHU building datasets. The first set compares the E-Unet, Unet and Res-net networks; the results show that E-Unet not only outperforms Unet and Res-net on the accuracy evaluation metrics, but also extracts the details of building edges and corners completely. The second set consists of ablation experiments that isolate the contribution of the pre-processing module to extraction accuracy; the results show that the pre-processing module improves the extraction accuracy of different networks. Together, the two sets of experiments demonstrate the effectiveness of the pre-processing module and the superiority of the proposed network.

Key words: deep learning, building extraction, atrous convolution, high-resolution remote sensing image, E-Unet
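
The abstract states that the convolution module enlarges the receptive field while reducing the number of parameters, but it does not describe the module itself. The sketch below is therefore not the E-Unet module; it only illustrates the general property of atrous (dilated) convolution that makes such a trade-off possible. A k x k kernel with dilation d covers k + (k - 1)(d - 1) pixels per axis while keeping k x k weights. All function names are hypothetical and chosen for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Pixels spanned along one axis by a kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights (bias excluded) of a square 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation on nested lists with a dilated square kernel."""
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - span + 1):
        out_row = []
        for c in range(cols - span + 1):
            acc = 0.0
            # Kernel taps are spaced `dilation` pixels apart in the input.
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i * dilation][c + j * dilation]
            out_row.append(acc)
        out.append(out_row)
    return out


# A 3x3 kernel with dilation 2 spans 5x5 pixels but keeps 9 weights,
# whereas a dense 5x5 kernel covering the same area needs 25.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
ones = [[1.0] * 3 for _ in range(3)]
result = dilated_conv2d(image, ones, dilation=2)
```

In this toy case the dilated kernel samples only rows and columns 0, 2 and 4 of the 5 x 5 input, so a single output value summarizes the full 5 x 5 neighborhood with nine weights.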

CLC number: