Combined Saliency with Multi-Convolutional Neural Network for High Resolution Remote Sensing Scene Classification

HE Xiaofei, ZOU Zhengrong, TAO Chao, ZHANG Jiaxing

School of Geosciences and Info-Physics, Central South University, Changsha 410083, China

HE Xiaofei (1991-), male, master's degree; research interest: classification of high resolution remote sensing images. E-mail: hxf0321@qq.com

Acta Geodaetica et Cartographica Sinica, 2016, Vol. 45, Issue 9: 1073-1080

Section: Photogrammetry and Remote Sensing

DOI: https://doi.org/10.11947/j.AGCS.2016.20150612

Received date: 2015-12-04

Revised date: 2016-07-01

Online published: 2016-09-29

Supported by

The National Natural Science Foundation of China (Nos. 41301453; 51479215); The National Basic Research Program of China (973 Program) (No. 2012CB719903); Research Fund for the Doctoral Program of Higher Education (No. 20130162120027)

Cite this article

HE Xiaofei, ZOU Zhengrong, TAO Chao, ZHANG Jiaxing. Combined Saliency with Multi-Convolutional Neural Network for High Resolution Remote Sensing Scene Classification[J]. Acta Geodaetica et Cartographica Sinica, 2016, 45(9): 1073-1080. DOI: 10.11947/j.AGCS.2016.20150612

Abstract

Scene information in high resolution remote sensing images is important for image interpretation and for understanding the real world. Traditional scene classification methods mostly rely on low- and mid-level hand-crafted features, but high resolution remote sensing images are rich in information and have complex scene composition, so high-level features are needed to represent them. This paper proposes a method that combines saliency with a multi-layer convolutional neural network. First, saliency sampling is used to obtain meaningful patches that contain the dominant information of the image. Second, these patches are fed to a convolutional neural network as the training sample set, yielding feature representations at different levels. Finally, the multi-layer features are combined and classified with a support vector machine (SVM). Experiments on two high resolution scene data sets, the 21-class UC Merced set and the 7-class Wuhan set, show that saliency sampling effectively captures the main targets, weakens the influence of unrelated targets, and reduces data redundancy, and that the convolutional neural network learns high-level features automatically. Compared with existing methods, the proposed method effectively improves classification accuracy.

Key words: convolutional neural network; saliency detection; high resolution remote sensing image; scene classification
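
The pipeline summarized in the abstract (salient patch sampling, multi-layer features, joint descriptor for an SVM) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the saliency model or the sampling rule, so the sketch assumes a precomputed saliency map and simply keeps the k windows with the highest mean saliency. The names `salient_patches`, `joint_feature`, and all toy values are hypothetical, and CNN training and the SVM itself are left out.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def patch_mean(saliency: Grid, top: int, left: int, size: int) -> float:
    """Mean saliency inside the size x size window whose corner is (top, left)."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += saliency[r][c]
    return total / (size * size)


def salient_patches(saliency: Grid, size: int, stride: int, k: int) -> List[Tuple[int, int]]:
    """Return top-left corners of the k windows with the highest mean saliency.

    Assumption: ranking windows by mean saliency stands in for the paper's
    saliency sampling step, whose exact rule is not given in the abstract.
    """
    rows, cols = len(saliency), len(saliency[0])
    scored = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            scored.append((patch_mean(saliency, top, left, size), top, left))
    # Highest score first; ties fall back to scan order so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(top, left) for _, top, left in scored[:k]]


def joint_feature(layer_features: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-layer feature vectors into a single descriptor."""
    joined: List[float] = []
    for features in layer_features:
        joined.extend(features)
    return joined


# Toy 4x4 saliency map (hypothetical values) with a bright region in the lower right.
toy_saliency = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.1, 0.0],
    [0.0, 0.2, 0.9, 0.8],
    [0.0, 0.0, 0.8, 0.9],
]
best = salient_patches(toy_saliency, size=2, stride=1, k=2)

# Hypothetical feature vectors from two network layers for one patch.
descriptor = joint_feature([[0.2, 0.5], [0.1, 0.3, 0.7]])
```

In a full system, the corners in `best` would select the image patches used to train the network, and a descriptor like `descriptor` would be built per image and passed to an SVM classifier.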

