Acta Geodaetica et Cartographica Sinica ›› 2020, Vol. 49 ›› Issue (8): 1051-1064. doi: 10.11947/j.AGCS.2020.20190407

• Photogrammetry and Remote Sensing •

A remote sensing image classification procedure based on multilevel attention fusion U-Net

LI Daoji, GUO Haitao, LU Jun, ZHAO Chuan, LIN Yuzhun, YU Donghang

  1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
  • Received: 2019-09-27 Revised: 2020-03-23 Published: 2020-08-25
  • Corresponding author: LU Jun E-mail: ljhb45@126.com
  • About the author: LI Daoji (1994-), male, master's student; research interest: object classification of remote sensing images. E-mail: wang111@alumni.sjtu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (No. 41601507)

Abstract: Classical convolutional neural networks rarely produce satisfactory object classification results on remote sensing images, because the objects in an image differ widely in size and spectral characteristics and the targets to be classified sit in complex backgrounds. To address these problems, this paper draws on the multilevel feature fusion idea of U-shaped convolutional neural networks and presents the multilevel attention fusion U-Net (MAFU-Net). The network applies attention modules to extract and process semantic information at different levels, strengthening the correlations between pixels at different positions and between different feature maps (channels), which improves classification performance under complex backgrounds. To verify the proposed network for object classification of remote sensing images, experiments were carried out on the ISPRS Vaihingen dataset and on GF-2 datasets of the Beijing and Henan regions, and several mainstream semantic segmentation networks were used for comparison. The experimental results show that, compared with the other networks, MAFU-Net achieves the best classification results on datasets with different characteristics and does so in the least time. The network also has a simple structure, low computational complexity, and few parameters, which makes it highly practical. In addition, feature visualization was used to compare the classification performance of MAFU-Net and the other networks. That analysis shows that the underlying principles and mechanisms of most current deep learning models are complex and cannot be derived from exact mathematical principles, so it is difficult to explain why a particular network fails on a particular dataset. More advanced visualization methods and quantitative criteria are therefore needed to analyze and evaluate specific deep learning models and their performance, so that more advanced model structures can be designed.

Key words: object classification, remote sensing image, attention mechanism, U-shape convolutional neural network, semantic segmentation
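
The abstract states that the attention modules strengthen correlations between pixels at different positions and between different feature maps, but it does not give their exact form. The following is only a minimal sketch of that general idea, assuming plain dot-product self-attention with a residual connection; the function names, the `gamma` weight, and the use of nested lists instead of tensors are illustrative assumptions, not the MAFU-Net implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def channel_attention(feat, gamma=1.0):
    """feat has C rows (feature maps), each with N flattened pixel values.

    Hypothetical sketch: re-weights every feature map by its similarity
    to all other feature maps, then adds the input back (residual).
    """
    energy = matmul(feat, transpose(feat))       # C x C channel similarity
    attn = [softmax(row) for row in energy]      # each row sums to 1
    mixed = matmul(attn, feat)                   # C x N
    return [[gamma * m + f for m, f in zip(mr, fr)]
            for mr, fr in zip(mixed, feat)]


def position_attention(feat, gamma=1.0):
    """Same idea over pixel positions: an N x N similarity matrix.

    Hypothetical sketch: each pixel aggregates the features of all
    pixels, weighted by feature similarity, plus a residual.
    """
    pix = transpose(feat)                        # N x C, one row per pixel
    energy = matmul(pix, feat)                   # N x N position similarity
    attn = [softmax(row) for row in energy]
    mixed = transpose(matmul(attn, pix))         # back to C x N
    return [[gamma * m + f for m, f in zip(mr, fr)]
            for mr, fr in zip(mixed, feat)]
```

With `gamma=0.0` both functions return the input unchanged, so the attention term acts as an additive correction whose strength a real network would presumably learn. The N x N position matrix grows quadratically with the number of pixels, which is why such modules are usually applied to downsampled feature maps rather than to the full-resolution image.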
