Acta Geodaetica et Cartographica Sinica, 2023, Vol. 52, Issue (6): 980-989. doi: 10.11947/j.AGCS.2023.20210684

• Photogrammetry and Remote Sensing •

Improved U-Net remote sensing image semantic segmentation method

HU Gongming1, YANG Chuncheng1,2,3, XU Li1, SHANG Haibin1, WANG Zefan2, QIN Zhilong1

  1. National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430078, China;
  2. Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences, Wuhan 430074, China;
  3. School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
  • Received: 2021-12-10 Revised: 2022-05-23 Published: 2023-07-08
  • Corresponding author: YANG Chuncheng E-mail: yangcc@cug.edu.cn
  • About the first author: HU Gongming (1997-), male, master's degree, whose main research interest is semantic segmentation of remote sensing images. E-mail: gongminghu@cug.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (No. 42171438)

Abstract: Semantic segmentation of remote sensing images with deep neural networks is an important part of intelligent remote sensing interpretation and plays an important role in urban planning, disaster assessment, agricultural production, and other fields. High-resolution remote sensing images are characterized by complex backgrounds, diverse scales, and irregular shapes, so semantic segmentation methods designed for natural scenes often yield low segmentation accuracy when applied to them. To address this, this paper builds on the U-Net model and proposes a multi-scale skip connection method that fuses semantic features from different levels to obtain accurate segmentation boundaries and location information. An attention mechanism and pyramid pooling are introduced to handle fine segmentation against complex backgrounds. To verify the effectiveness of the proposed method, experiments were carried out on the WHDLD and LandCover.ai datasets, and the results were compared with mainstream semantic segmentation methods. The experimental results show that the proposed method reaches an mIoU of 74.28% and 82.04% and a mean F1 score of 84.47% and 89.76% on the two datasets, respectively, outperforming all compared methods. Compared with the segmentation results of U-Net, IoU improves markedly on classes that make up a relatively small proportion of the data, such as buildings and roads, and is also higher than that of the other compared methods.

Key words: remote sensing semantic segmentation, U-Net, attention mechanism, multi-scale skip connection, pyramid pooling
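
The abstract reports results as mIoU and mean F1 but does not spell out how they are computed. The following is a minimal sketch of the conventional per-class definitions, assuming both metrics are derived from a pixel-level confusion matrix and averaged over classes with equal weight; the function names `confusion_matrix` and `iou_and_f1` and the toy label maps are illustrative and are not taken from the paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    # Encode each (reference, prediction) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def iou_and_f1(cm):
    """Return per-class IoU and F1 arrays from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but reference differs
    fn = cm.sum(axis=1) - tp  # reference is the class, but prediction differs
    # Assumption: a class absent from both maps scores 0 instead of NaN.
    iou = tp / np.maximum(tp + fp + fn, 1)
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    return iou, f1


# Toy 2x3 label maps with three classes (hypothetical data).
reference = [[0, 0, 1], [1, 2, 2]]
predicted = [[0, 1, 1], [1, 2, 0]]

cm = confusion_matrix(reference, predicted, num_classes=3)
iou, f1 = iou_and_f1(cm)
miou = iou.mean()
mean_f1 = f1.mean()
print(f"mIoU = {miou:.4f}, mean F1 = {mean_f1:.4f}")
```

Under these definitions a class that covers few pixels weighs as much as a dominant one in the average, which is why gains on small classes such as buildings and roads show up clearly in mIoU.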

CLC number: