Acta Geodaetica et Cartographica Sinica ›› 2023, Vol. 52 ›› Issue (10): 1693-1702. doi: 10.11947/j.AGCS.2023.20220286

• Photogrammetry and Remote Sensing •

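GLFFNet model for remote sensing image scene classification

WANG Wei1, DENG Jiwei1, WANG Xin1, LI Zhiyong2, YUAN Ping3

  1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China;
  2. Hunan Shenfan Science and Technology Limited Company, Changsha 410011, China;
  3. Changsha Jingwang Information and Technology Limited Company, Changsha 410010, China
  • Received: 2022-05-05 Revised: 2023-02-22 Published: 2023-10-31
  • Corresponding author: WANG Xin E-mail: wangxin@csust.edu.cn
  • About the author: WANG Wei (1974-), male, PhD, professor, doctoral supervisor; research interests: computer vision and pattern recognition. E-mail: wangwei@csust.edu.cn
  • Supported by:
    Key Research and Development Project of Hunan Province (No. 2020SK2134); Hunan Natural Science Foundation Project (Nos. 2019JJ80105; 2022JJ30625); Changsha Science and Technology Planning Project (No. kq2004071)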

Abstract: Traditional scene classification models cannot extract multi-scale key features from remote sensing images in a lightweight and efficient manner, and deep learning methods generally suffer from heavy computation and slow convergence. To address these problems, this paper exploits the ability of CNN and Transformer structures to extract features at different scales and proposes a feature extraction module, the global and local features fused (GLFF) block. Based on this module, a lightweight remote sensing image scene classification model, GLFFNet, is designed, which extracts both local and global information well. To verify the effectiveness of GLFFNet, the open-source remote sensing image datasets RSSCN7 and SIRI-WHU are used to compare the complexity and recognition ability of GLFFNet with those of other deep learning networks. GLFFNet achieves recognition accuracies of 94.82% and 95.83% on the RSSCN7 and SIRI-WHU datasets, respectively, outperforming other state-of-the-art models.

Key words: remote sensing image, scene classification, convolutional neural network, Transformer, GLFFNet model
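
The abstract does not specify the internals of the GLFF block, so the following is only a toy sketch of the general idea it names: a convolution-like branch that mixes neighbouring positions (local information) and a self-attention-like branch that mixes all positions (global information), fused into one output. Everything here is an assumption made for illustration: the function names `local_branch`, `global_branch` and `fused_block`, the 1-D scalar features, the smoothing kernel, and fusion by addition with a residual connection are hypothetical and are not taken from GLFFNet.

```python
import math


def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Convolution-like local feature: each output mixes only adjacent positions."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):  # zero padding at the borders
                acc += w * x[j]
        out.append(acc)
    return out


def global_branch(x):
    """Self-attention-like global feature: each output is a softmax-weighted
    average over all positions (query = key = value = the raw scalar)."""
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def fused_block(x):
    """Hypothetical fusion: element-wise sum of both branches plus a residual."""
    loc = local_branch(x)
    glo = global_branch(x)
    return [xi + l + g for xi, l, g in zip(x, loc, glo)]


features = [0.0, 1.0, 0.0, 0.0, 2.0]
fused = fused_block(features)
```

In this sketch the local branch only ever sees a three-element window, while every output of the global branch depends on the whole input, which is the complementarity the abstract attributes to CNN and Transformer structures. A real model would operate on 2-D feature maps with learned weights.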
