Acta Geodaetica et Cartographica Sinica ›› 2020, Vol. 49 ›› Issue (8): 1051-1064. doi: 10.11947/j.AGCS.2020.20190407

• Photogrammetry and Remote Sensing •

A remote sensing image classification procedure based on multilevel attention fusion U-Net

LI Daoji, GUO Haitao, LU Jun, ZHAO Chuan, LIN Yuzhun, YU Donghang   

  1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
  • Received: 2019-09-27 Revised: 2020-03-23 Published: 2020-08-25
  • Supported by:
    The National Natural Science Foundation of China (No. 41601507)

Abstract: Traditional convolutional neural networks can rarely obtain satisfactory classification results on remote sensing images because objects differ widely in size and spectral characteristics, and complex backgrounds further interfere with classification. To address this problem, a multilevel attention fusion U-Net (MAFU-Net) is presented. To strengthen the correlations between different pixels and channels, an attention module is applied to extract and process semantic information at different levels, which further improves the classification performance of the network under complex backgrounds. To verify the effectiveness of the proposed network for remote sensing image classification, experiments were carried out on the ISPRS Vaihingen dataset and on the GF-2 Beijing and Henan datasets, with several other semantic segmentation networks used for comparison. The experimental results show that the proposed network has fewer parameters and lower computational complexity, yet achieves higher classification accuracy in the least time, which makes it highly practical. In addition, feature visualization was used to analyze the classification performance of MAFU-Net and the other networks. The results also show that most deep learning network models are difficult to derive from exact mathematical principles, and that it is difficult to explain why a particular network fails on a particular dataset. Further study, or more advanced visualization and quantification criteria, is therefore required to analyze and evaluate specific deep learning models and network performance, so that more advanced model structures can be designed.

Key words: object classification, remote sensing image, attention mechanism, U-shape convolutional neural network, semantic segmentation
