Acta Geodaetica et Cartographica Sinica ›› 2018, Vol. 47 ›› Issue (5): 620-630. doi: 10.11947/j.AGCS.2018.20170191

• Photogrammetry and Remote Sensing •

A Multi-scale Neural Network Classification Method for High-Resolution Remote Sensing Imagery Scenes

ZHENG Zhuo1,2, FANG Fang1, LIU Yuanyuan1, GONG Xi1, GUO Mingqiang1, LUO Zhongwen1   

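  1. College of Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, Hubei, China;
    2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, Hubei, China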
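  • Received: 2017-04-17 Revised: 2018-02-09 Publication date: 2018-05-20 Release date: 2018-06-01
  • Corresponding author: LIU Yuanyuan E-mail: liuyy@cug.edu.cn
  • About the first author: ZHENG Zhuo (1996-), male, undergraduate student; research interests: deep learning and remote sensing image interpretation.
  • Funding:
    National Natural Science Foundation of China (61602429; 41701446); China Geological Survey Project (KZ17Z618)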

Joint Multi-scale Convolution Neural Network for Scene Classification of High Resolution Remote Sensing Imagery

ZHENG Zhuo1,2, FANG Fang1, LIU Yuanyuan1, GONG Xi1, GUO Mingqiang1, LUO Zhongwen1   

  1. College of Information Engineering, China University of Geosciences, Wuhan 430074, China;
    2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received:2017-04-17 Revised:2018-02-09 Online:2018-05-20 Published:2018-06-01
  • Supported by:
    The National Natural Science Foundation of China (Nos. 61602429; 41701446); China Geological Survey Project (No. KZ17Z618)

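Abstract (translated from Chinese): Scene classification of high-resolution remote sensing imagery is the basis for fast, automatic recognition of complex scenes and is of great importance in fields such as military applications and disaster relief. To achieve high recognition accuracy on limited remote sensing datasets, this paper proposes a scene classification method for high-resolution remote sensing imagery based on a joint multi-scale convolutional neural network (JMCNN) model. Unlike traditional convolutional neural network models, JMCNN is an end-to-end joint multi-scale convolutional network with three channels of different scales, consisting of three parts: a multi-channel feature extractor, joint multi-scale feature fusion, and Softmax classification. First, the multi-channel feature extractor extracts mid- and high-level multi-scale features from the image. Then, the joint multi-scale feature fusion fuses the mid- and high-level multi-scale features of the channels several times to enhance the feature representation. Finally, Softmax classifies the high-level features. Tests on the UC Merced and SIRI remote sensing datasets show that the JMCNN model improves markedly in both feature representation and computation speed, reaching recognition accuracies of 89.3% and 88.3%, respectively, with small sample sizes.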

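Keywords (translated from Chinese): high-resolution remote sensing imagery, scene classification, joint multi-scale convolutional neural network, enhanced high-level feature representation, limited datasets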

Abstract: Scene classification of high-resolution remote sensing imagery is the basis for automatic recognition of complex scenes and is a key technology in fields such as military applications and disaster relief. In this paper, we propose a joint multi-scale convolutional neural network (JMCNN) for scene classification of high-resolution remote sensing imagery using a limited amount of image data. Unlike a traditional convolutional neural network, the proposed JMCNN is an end-to-end model with a jointly enhanced high-level feature representation. It consists of a multi-channel feature extractor, joint multi-scale feature fusion, and a Softmax classifier. First, multi-channel, multi-scale convolutional extractors extract mid-level scene features. Then, to obtain an enhanced high-level feature representation from a limited dataset, joint multi-scale feature fusion combines the multi-channel, multi-scale features in two fusion steps. Finally, the enhanced high-level feature representation is classified by Softmax. Experiments were conducted on two public datasets of limited size, UCM and SIRI. Compared with state-of-the-art methods, JMCNN achieved improved performance and strong robustness, with average accuracies of 89.3% and 88.3% on the two datasets, respectively.
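
The three-stage pipeline described in the abstract (one feature extractor per scale, fusion of the per-scale features, Softmax classification) can be illustrated with a toy sketch. This is not the authors' JMCNN: there are no convolutions and no training, the per-channel extractor is replaced by a simple max-pooling stand-in, fusion is a single concatenation rather than the two fusion steps the paper describes, and the weights are random. All function names, the 8x8 input size, and the scale factors (1, 2, 4) are assumptions made for illustration only.

```python
import math
import random

def downscale(image, factor):
    """Average-pool a square 2D list by `factor` to emulate a coarser input scale."""
    n = len(image) // factor
    return [[sum(image[i * factor + di][j * factor + dj]
                 for di in range(factor) for dj in range(factor)) / factor ** 2
             for j in range(n)] for i in range(n)]

def extract_features(image, grid=2):
    """Stand-in for one convolutional channel: max value in each cell of a grid x grid layout."""
    cell = len(image) // grid
    return [max(image[i * cell + di][j * cell + dj]
                for di in range(cell) for dj in range(cell))
            for i in range(grid) for j in range(grid)]

def fuse_scales(image, scales=(1, 2, 4)):
    """Run one extractor per scale and fuse the results by concatenation (assumed fusion rule)."""
    per_scale = [extract_features(downscale(image, s)) for s in scales]
    return [v for feats in per_scale for v in feats]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image, weights):
    """Linear layer on the fused features followed by softmax."""
    fused = fuse_scales(image)
    logits = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return softmax(logits)

random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]   # hypothetical 8x8 grayscale scene
num_classes = 3                                                   # hypothetical number of scene classes
weights = [[random.uniform(-1, 1) for _ in range(12)] for _ in range(num_classes)]  # untrained
probs = classify(image, weights)
print(probs)
```

Each of the three scales contributes a 4-dimensional feature vector, so the fused vector has 12 entries, and `probs` is a probability distribution over the hypothetical classes. In a real network the stand-in extractor would be a stack of trained convolutional layers per channel.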

Key words: high resolution remote sensing imagery, scene classification, joint multi-scale convolution neural network, enhanced high-level feature representation, limited datasets

CLC number: