Acta Geodaetica et Cartographica Sinica ›› 2025, Vol. 54 ›› Issue (5): 911-923. DOI: 10.11947/j.AGCS.2025.20240281

• Photogrammetry and Remote Sensing •

Multi-label scene classification method based on fusion of SAR and optical remote sensing images

Yiming ZHAO1, Kelin HU2, Kelong TU1, Yaxian QING3, Chao YANG2, Kunlun QI1,2, Huayi WU3

  1. School of Geography and Information Engineering, China University of Geosciences, Wuhan 430078, China
  2. National Research Center for Geographic Information System Engineering Technology, Wuhan 430078, China
  3. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received: 2024-07-10; Revised: 2025-03-20; Online: 2025-06-23; Published: 2025-06-23
  • Contact: Kunlun QI. E-mail: zym805805@cug.edu.cn; qikunlun@cug.edu.cn
  • About author: ZHAO Yiming (1999—), male, postgraduate, majors in multi-modal remote sensing image fusion. E-mail: zym805805@cug.edu.cn
  • Supported by:
    Hubei Key Research and Development Program of China (2020AAA004); Hubei Lujia Laboratory Special Fund (220100034)

Abstract:

Deep convolutional neural networks have proven to be among the most effective methods for scene classification of high-resolution remote sensing images. However, most previous studies focus on scene-level classification of single optical images and are limited to single-label annotation, even though single optical images are often degraded by weather conditions and a single label cannot fully describe complex image content. Therefore, in this paper we construct a multi-modal, multi-label scene classification dataset, SEN12-MLRS, from SAR and optical remote sensing images acquired by the European Space Agency in 2020, and propose a parallel dual attention fusion network (PDANet) for multi-label scene classification. PDANet extracts features from the optical and SAR images and fuses them across modalities and levels through three components: two-branch feature extraction, adaptive feature fusion, and multi-level feature fusion. Experimental results demonstrate that PDANet outperforms many state-of-the-art models on the SEN12-MLRS dataset, and ablation experiments further validate the effectiveness of the proposed network and its modules.
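Although the paper's full design is not reproduced on this page, the pipeline described in the abstract (two-branch feature extraction, dual attention, adaptive fusion, multi-label output) can be illustrated with a small PyTorch sketch. Everything below is an assumption made for illustration rather than PDANet's actual architecture: the ResNet-18 backbones, the SE-style channel attention paired with 7×7 spatial attention, the two-channel (VV/VH) SAR input, the scalar fusion gate, and the number of output classes are all hypothetical.

```python
# Minimal sketch of a PDANet-style dual attention fusion network.
# All design choices here (ResNet-18 backbones, attention form, 2-channel
# SAR input, scalar fusion gate, 17 classes) are illustrative assumptions,
# not the architecture described in the paper.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class DualAttention(nn.Module):
    """Channel attention (SE-style) applied in parallel with spatial attention."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # global context
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),  # where to look
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Parallel branches: re-weight channels and spatial positions, then sum.
        return x * self.channel_att(x) + x * self.spatial_att(x)


class PDANetSketch(nn.Module):
    """Two-branch extraction -> dual attention -> adaptive fusion -> multi-label head."""

    def __init__(self, num_classes: int = 17):  # class count is hypothetical
        super().__init__()
        # Optical branch: standard 3-channel ResNet-18 trunk (avgpool/fc removed).
        self.opt_branch = nn.Sequential(*list(resnet18(weights=None).children())[:-2])
        # SAR branch: first conv adapted to 2 polarization channels (VV/VH).
        sar = resnet18(weights=None)
        sar.conv1 = nn.Conv2d(2, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.sar_branch = nn.Sequential(*list(sar.children())[:-2])
        self.att_opt = DualAttention(512)
        self.att_sar = DualAttention(512)
        self.gate = nn.Parameter(torch.zeros(1))  # learned modality balance
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, num_classes)
        )

    def forward(self, optical: torch.Tensor, sar: torch.Tensor) -> torch.Tensor:
        f_opt = self.att_opt(self.opt_branch(optical))
        f_sar = self.att_sar(self.sar_branch(sar))
        w = torch.sigmoid(self.gate)               # adaptive fusion weight in (0, 1)
        fused = w * f_opt + (1 - w) * f_sar
        return self.head(fused)                    # logits, one per scene label


if __name__ == "__main__":
    model = PDANetSketch()
    logits = model(torch.randn(1, 3, 256, 256), torch.randn(1, 2, 256, 256))
    print(logits.shape)  # torch.Size([1, 17])
```

Because the task is multi-label, the head emits independent logits: training would pair them with torch.nn.BCEWithLogitsLoss, and inference would threshold the per-class sigmoid outputs rather than take an argmax, so each scene can carry several labels at once.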

Key words: multi-modal remote sensing image fusion, attention mechanism, multi-label classification, feature fusion
