Acta Geodaetica et Cartographica Sinica (测绘学报) ›› 2025, Vol. 54 ›› Issue (5): 911-923. doi: 10.11947/j.AGCS.2025.20240281

• Photogrammetry and Remote Sensing •

Multi-label scene classification method based on fusion of SAR and optical remote sensing images

Yiming ZHAO1(), Kelin HU2, Kelong TU1, Yaxian QING3, Chao YANG2, Kunlun QI1,2(), Huayi WU3   

  1. School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
    2.National Research Center for Geographic Information System Engineering Technology, Wuhan 430078, China
    3.State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received: 2024-07-10 Revised: 2025-03-20 Online: 2025-06-23 Published: 2025-06-23
  • Contact: Kunlun QI E-mail:zym805805@cug.edu.cn;qikunlun@cug.edu.cn
  • About author: ZHAO Yiming (1999—), male, postgraduate, whose research interests include multi-modal remote sensing data fusion. E-mail: zym805805@cug.edu.cn
  • Supported by:
    Major Science and Technology Project of the Hubei Provincial Department of Science and Technology (No. 2020AAA004); Hubei Luojia Laboratory Special Fund (No. 220100034)

Abstract:

Deep convolutional neural networks have proven to be among the most effective methods for scene classification of high-resolution remote sensing images. Most previous studies focus on scene-level classification of single optical remote sensing images and are largely limited to single-label classification. However, single optical images are easily constrained by weather conditions, and single-label annotations cannot fully describe complex image content. Therefore, using SAR and optical remote sensing images acquired by the European Space Agency in 2020, we construct SEN12-MLRS, a multi-modal, multi-label scene classification dataset covering Wuhan, and propose a parallel dual attention fusion network (PDANet) for multi-label scene classification. PDANet extracts features from the optical and SAR images with a two-branch backbone and fuses them across modalities and levels through adaptive feature fusion and multilevel feature fusion. Experimental results demonstrate that PDANet outperforms a range of state-of-the-art models on the SEN12-MLRS dataset, and ablation experiments further validate the effectiveness of the proposed network and its modules.
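The abstract describes PDANet only at a high level. As a purely illustrative aid, the core fusion idea it names (two modality branches, adaptively learned fusion weights, and an independent sigmoid per label for multi-label output) can be sketched in NumPy. Every dimension, weight matrix, and the exact gating form below is an assumption chosen for clarity, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(42)

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: a 64-d feature per branch, 8 candidate scene labels.
FEAT_DIM, NUM_LABELS = 64, 8

# Stand-ins for the pooled CNN outputs of the two branches.
f_opt = rng.standard_normal(FEAT_DIM)  # optical-branch feature
f_sar = rng.standard_normal(FEAT_DIM)  # SAR-branch feature

# Adaptive fusion (hypothetical form): a small gating layer scores each
# modality, and a softmax turns the two scores into fusion weights that
# sum to one, so the network can lean on SAR when optics are degraded.
W_gate = 0.1 * rng.standard_normal((2, FEAT_DIM))
scores = np.array([W_gate[0] @ f_opt, W_gate[1] @ f_sar])
weights = softmax(scores)
f_fused = weights[0] * f_opt + weights[1] * f_sar

# Multi-label head: an independent sigmoid per label (not a softmax over
# labels), so several scene labels can be active for the same image pair.
W_cls = 0.1 * rng.standard_normal((NUM_LABELS, FEAT_DIM))
probs = sigmoid(W_cls @ f_fused)
predicted = probs > 0.5  # boolean multi-label prediction vector
```

The sigmoid-per-label head is the standard choice for multi-label classification: unlike a softmax, it does not force the label probabilities to compete, which matches a scene that is simultaneously, say, "residential" and "water".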

Key words: multi-modal remote sensing image fusion, attention mechanism, multi-label classification, feature fusion

CLC number: