Acta Geodaetica et Cartographica Sinica ›› 2024, Vol. 53 ›› Issue (7): 1371-1383. doi: 10.11947/j.AGCS.2024.20230074
Wei WANG, Wei ZHENG, Xin WANG

Received: 2023-03-16
Published: 2024-08-12
Contact: Xin WANG. E-mail: wangwei@csust.edu.cn; wangxin@csust.edu.cn
About author: WANG Wei (1974—), male, PhD, professor, PhD supervisor, majors in computer vision and pattern recognition. E-mail: wangwei@csust.edu.cn
Abstract:
Both local and global information are crucial in remote sensing image classification. Current methods fall mainly into two families: convolutional neural networks (CNNs) and Transformers. CNNs excel at extracting local information but are limited in capturing global information; Transformers, by contrast, capture global information well but incur high computational complexity. To improve remote sensing scene classification performance while reducing complexity, the pure-convolution network LAG-MANet is designed. It attends to both local and global features and also accounts for multi-scale features. After preprocessing, the input image first passes through a multi-branch dilated convolution module (MBDConv) that extracts multi-scale features. It then proceeds through the network's four stages, each of which uses the parallel branches of a dual-domain feature fusion module (P2DF) to extract local and global features and fuse them. Finally, global average pooling followed by a fully connected layer outputs the classification label. LAG-MANet achieves classification accuracies of 97.76%, 97.04%, and 97.18% on the WHU-RS19, SIRI-WHU, and RSSCN7 datasets, respectively. The experimental results on these three challenging public remote sensing datasets demonstrate the superiority of LAG-MANet.
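To make the multi-scale idea behind MBDConv concrete — assuming it runs parallel 3×3 convolutions at several dilation rates, a common realization of multi-branch dilated convolution and not necessarily the paper's exact design (the rates 1, 2, 3 and the averaging kernel here are illustrative assumptions) — a minimal NumPy sketch:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """'Same'-padded single-channel 2-D convolution with a dilated kernel."""
    kh, kw = kernel.shape
    # Effective kernel size: k_eff = k + (k - 1) * (d - 1)
    eh = kh + (kh - 1) * (dilation - 1)
    ew = kw + (kw - 1) * (dilation - 1)
    ph, pw = eh // 2, ew // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # Sample the padded input with stride = dilation
            patch = xp[i:i + eh:dilation, j:j + ew:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

def mbd_conv(x, rates=(1, 2, 3)):
    """Multi-branch dilated convolution: one parallel branch per dilation
    rate, stacked into a multi-scale feature volume."""
    k = np.full((3, 3), 1.0 / 9.0)   # placeholder averaging kernel
    return np.stack([dilated_conv2d(x, k, d) for d in rates])

x = np.random.rand(8, 8)
y = mbd_conv(x)
print(y.shape)   # (3, 8, 8): one feature map per dilation rate
```

Each branch keeps the spatial size of the input while its receptive field grows with the dilation rate, which is what lets a single module see several scales at once.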
Wei WANG, Wei ZHENG, Xin WANG. LAG-MANet model for remote sensing image scene classification[J]. Acta Geodaetica et Cartographica Sinica, 2024, 53(7): 1371-1383.
Table 3
Ablation results for the P2DF module components

Model | RAC | SAM | MSI | CAM | RPFF | Accuracy/(%) |
---|---|---|---|---|---|---|
LAG-MANet-11 | × | √ | √ | √ | √ | 95.03±0.38 |
LAG-MANet-12 | √ | × | √ | √ | √ | 96.61±0.39 |
LAG-MANet-13 | √ | √ | × | √ | √ | 95.96±0.57 |
LAG-MANet-14 | √ | √ | √ | × | √ | 96.18±0.29 |
LAG-MANet-15 | √ | √ | √ | √ | × | 92.43±0.53 |
LAG-MANet-16 | × | × | √ | √ | √ | 93.86±0.55 |
LAG-MANet-17 | √ | √ | × | × | √ | 95.93±0.31 |
LAG-MANet-18 | √ | × | √ | × | √ | 96.32±0.18 |
LAG-MANet-19 | × | √ | × | √ | √ | 95.43±0.18 |
LAG-MANet-20 | × | √ | √ | × | √ | 94.79±0.48 |
LAG-MANet-21 | √ | × | × | √ | √ | 96.14±0.40 |
LAG-MANet | √ | √ | √ | √ | √ | 97.18±0.41 |
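The ablation above removes one P2DF component at a time. To illustrate the parallel local/global fusion idea described in the abstract — purely as a hypothetical sketch, since RAC, SAM, MSI, CAM, and RPFF are the paper's own sub-modules and are not reproduced here; the box filter, sigmoid gate, and weighted sum are stand-in assumptions — a dual-branch fusion can look like this:

```python
import numpy as np

def local_branch(x):
    """Local features: 3x3 box filter, a stand-in for a small convolution."""
    xp = np.pad(x, 1)
    views = [xp[i:i + x.shape[0], j:j + x.shape[1]]
             for i in range(3) for j in range(3)]
    return np.mean(views, axis=0)

def global_branch(x):
    """Global features: gate the map by its global average, a crude
    stand-in for global-context attention."""
    g = x.mean()                              # global average pooling to a scalar
    return x * (1.0 / (1.0 + np.exp(-g)))     # sigmoid-gated context

def dual_branch_fusion(x, alpha=0.5):
    """Fuse the two parallel branches by a weighted sum."""
    return alpha * local_branch(x) + (1 - alpha) * global_branch(x)

x = np.random.rand(8, 8)
y = dual_branch_fusion(x)
print(y.shape)   # (8, 8): the fused map keeps the spatial size
```

The point of the parallel design is that both branches see the same input and neither output has to pass through the other, so local detail and global context are preserved independently before fusion.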
Table 4
Model results on the WHU-RS19 dataset

Model | Accuracy/(%) | Precision/(%) | Recall/(%) | Specificity/(%) | F1-score/(%) |
---|---|---|---|---|---|
ResNet50[28] | 97.45±0.32 | 97.71±0.37 | 97.42±0.34 | 99.87±0.02 | 97.43±0.35 |
VGG16[26] | 92.25±1.59 | 93.05±1.48 | 92.31±1.59 | 99.57±0.08 | 92.27±1.60 |
EfficientNetV2[29] | 97.04±0.20 | 97.21±0.18 | 97.06±0.21 | 99.85±0.01 | 97.03±0.20 |
ConvNext[20] | 90.82±1.07 | 91.70±1.00 | 90.92±0.95 | 99.50±0.06 | 90.79±1.03 |
ViT[11] | 81.73±1.04 | 83.46±1.57 | 81.82±1.00 | 98.99±0.06 | 81.65±1.06 |
SwinTransformer[27] | 92.96±1.09 | 93.63±0.99 | 93.08±1.07 | 99.62±0.06 | 93.02±1.09 |
PoolFormer[30] | 93.17±0.89 | 93.73±0.81 | 93.20±0.86 | 99.63±0.05 | 93.18±0.85 |
Hornet[16] | 88.88±0.59 | 89.84±1.10 | 89.03±0.59 | 99.39±0.04 | 88.85±0.80 |
MogaNet[31] | 96.74±0.25 | 97.03±0.27 | 96.70±0.26 | 99.83±0.01 | 96.71±0.25 |
VAN[18] | 96.53±0.82 | 96.79±0.76 | 96.53±0.84 | 99.82±0.05 | 96.50±0.83 |
EMTCAL[32] | 88.06±0.79 | 88.93±0.76 | 87.97±0.83 | 99.35±0.04 | 87.89±0.83 |
GCSANet[33] | 96.10±0.41 | 96.34±0.41 | 96.07±0.39 | 99.79±0.02 | 96.04±0.42 |
LAG-MANet-Split | 97.86±0.50 | 98.04±0.45 | 97.87±0.51 | 99.89±0.03 | 97.86±0.50 |
LAG-MANet | 97.76±0.25 | 97.93±0.20 | 97.72±0.27 | 99.88±0.01 | 97.72±0.27 |
Table 5
Comparison of model complexity

Model | Parameters/(×10⁶) | FLOPs/(×10⁶) |
---|---|---|
ResNet50[28] | 23.52 | 4 131.71 |
VGG16[26] | 134.29 | 15 466.20 |
EfficientNetV2[29] | 20.19 | 2 897.32 |
ConvNext[20] | 27.80 | 4 454.77 |
ViT[11] | 85.65 | 16 862.87 |
SwinTransformer[27] | 27.50 | 4 371.13 |
PoolFormer[30] | 20.84 | 3 393.76 |
Hornet[16] | 21.86 | 3 967.88 |
MogaNet[31] | 24.79 | 4 947.57 |
VAN[18] | 13.35 | 2 505.09 |
EMTCAL[32] | 27.80 | 4 233.93 |
GCSANet[33] | 14.16 | 5 677.51 |
LAG-MANet-Split | 5.01 | 1 603.94 |
LAG-MANet | 12.51 | 3 648.77 |
Table 6
Model results on the SIRI-WHU dataset

Model | Accuracy/(%) | Precision/(%) | Recall/(%) | Specificity/(%) | F1-score/(%) |
---|---|---|---|---|---|
ResNet50[28] | 96.38±0.28 | 96.48±0.27 | 96.38±0.28 | 99.67±0.02 | 96.38±0.28 |
VGG16[26] | 93.67±0.98 | 93.82±0.99 | 93.67±0.98 | 99.42±0.09 | 93.66±0.99 |
EfficientNetV2[29] | 96.21±0.31 | 96.28±0.29 | 96.21±0.31 | 99.65±0.03 | 96.21±0.31 |
ConvNext[20] | 93.54±0.68 | 93.82±0.59 | 93.54±0.68 | 99.42±0.06 | 93.53±0.70 |
ViT[11] | 91.58±0.52 | 91.92±0.67 | 91.58±0.52 | 99.23±0.05 | 91.46±0.51 |
SwinTransformer[27] | 95.63±0.23 | 95.73±0.23 | 95.62±0.23 | 99.59±0.02 | 95.60±0.23 |
PoolFormer[30] | 95.50±0.31 | 95.65±0.33 | 95.50±0.31 | 99.58±0.03 | 95.50±0.32 |
Hornet[16] | 92.96±0.87 | 93.19±0.90 | 92.96±0.87 | 99.37±0.08 | 92.97±0.87 |
MogaNet[31] | 95.83±0.46 | 95.92±0.48 | 95.83±0.46 | 99.62±0.05 | 95.84±0.46 |
VAN[18] | 95.54±0.39 | 95.66±0.34 | 95.54±0.39 | 99.59±0.03 | 95.53±0.40 |
EMTCAL[32] | 93.50±0.30 | 93.66±0.30 | 93.50±0.30 | 99.41±0.03 | 93.51±0.30 |
GCSANet[33] | 93.90±0.84 | 94.01±0.77 | 93.91±0.84 | 99.45±0.08 | 93.90±0.84 |
LAG-MANet-Split | 96.50±0.24 | 96.61±0.21 | 96.50±0.24 | 99.68±0.02 | 96.49±0.24 |
LAG-MANet | 97.04±0.20 | 97.13±0.18 | 97.04±0.20 | 99.73±0.01 | 97.04±0.20 |
Table 7
Model results on the RSSCN7 dataset

Model | Accuracy/(%) | Precision/(%) | Recall/(%) | Specificity/(%) | F1-score/(%) |
---|---|---|---|---|---|
ResNet50[28] | 95.93±0.41 | 95.95±0.42 | 95.94±0.41 | 99.34±0.07 | 95.93±0.42 |
VGG16[26] | 91.43±0.80 | 91.44±0.85 | 91.43±0.80 | 95.58±0.14 | 91.38±0.83 |
EfficientNetV2[29] | 95.14±0.21 | 95.16±0.20 | 95.15±0.22 | 99.22±0.03 | 95.13±0.21 |
ConvNext[20] | 90.86±1.04 | 90.93±1.03 | 90.86±1.04 | 98.49±0.18 | 90.82±1.04 |
ViT[11] | 90.61±0.27 | 90.62±0.23 | 90.61±0.26 | 98.43±0.05 | 90.57±0.27 |
SwinTransformer[27] | 94.75±0.62 | 94.83±0.58 | 94.75±0.61 | 99.15±0.11 | 94.73±0.62 |
PoolFormer[30] | 94.43±0.52 | 94.48±0.52 | 94.43±0.52 | 99.10±0.09 | 94.41±0.53 |
Hornet[16] | 89.43±0.84 | 89.47±0.83 | 89.43±0.85 | 98.23±0.14 | 89.34±0.87 |
MogaNet[31] | 95.14±0.37 | 95.17±0.36 | 95.14±0.38 | 99.21±0.06 | 95.13±0.37 |
VAN[18] | 94.22±0.33 | 94.27±0.36 | 94.22±0.33 | 99.06±0.06 | 94.19±0.33 |
EMTCAL[32] | 93.57±0.25 | 93.76±0.28 | 93.56±0.25 | 98.94±0.04 | 93.61±0.26 |
GCSANet[33] | 92.84±0.41 | 92.99±0.41 | 92.85±0.41 | 98.82±0.07 | 92.83±0.41 |
LAG-MANet-Split | 96.07±0.16 | 96.12±0.21 | 96.07±0.17 | 99.37±0.02 | 96.07±0.17 |
LAG-MANet | 97.18±0.41 | 97.20±0.43 | 97.18±0.42 | 99.55±0.07 | 97.18±0.42 |
[1] | GONG Jianya, ZHANG Mi, HU Xiangyun, et al. The design of deep learning framework and model for intelligent remote sensing[J]. Acta Geodaetica et Cartographica Sinica, 2022, 51(4): 475-487. DOI: 10.11947/j.AGCS.2022.20220027. |
[2] | WU Qiong, GE Daqing, YU Junchuan, et al. Deep learning identification technology of InSAR significant deformation zone of potential landslide hazard at large scale[J]. Acta Geodaetica et Cartographica Sinica, 2022, 51(10): 2046-2055. DOI: 10.11947/j.AGCS.2022.20220303. |
[3] | CHENG Gong, XIE Xingxing, HAN Junwei, et al. Remote sensing image scene classification meets deep learning: challenges, methods, benchmarks, and opportunities[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13:3735-3756. |
[4] | CHEN Leiyu, LI Shaobo, BAI Qiang, et al. Review of image classification algorithms based on convolutional neural networks[J]. Remote Sensing, 2021, 13(22):4712. |
[5] | YUAN Yuan, FANG Jie, LU Xiaoqiang, et al. Remote sensing image scene classification using rearranged local features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(3):1779-1792. |
[6] | AKODAD S, BOMBRUN L, XIA Junshi, et al. Ensemble learning approaches based on covariance pooling of CNN features for high resolution remote sensing scene classification[J]. Remote Sensing, 2020, 12(20):3292. |
[7] | TONG Wei, CHEN Weitao, HAN Wei, et al. Channel-attention-based DenseNet network for remote sensing image scene classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13:4121-4132. |
[8] | HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132-7141. |
[9] | ZHANG Guokai, XU Weizhe, ZHAO Wei, et al. A multiscale attention network for remote sensing scene images classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14:9530-9545. |
[10] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30: 6000-6010. |
[11] | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[C]//Proceedings of the 9th International Conference on Learning Representations. Virtual Event: IEEE, 2020: 1-12. |
[12] | BAZI Y, BASHMAL L, AL RAHHAL M M, et al. Vision transformers for remote sensing image classification[J]. Remote Sensing, 2021, 13(3):516. |
[13] | WANG Wei, DENG Jiwei, WANG Xin, et al. GLFFNet model for remote sensing image scene classification[J]. Acta Geodaetica et Cartographica Sinica, 2023, 52(10): 1693-1702. DOI: 10.11947/j.AGCS.2023.20220286. |
[14] | WANG Wei, HU Ting, WANG Xin, et al. BFRNet: bidimensional feature representation network for remote sensing images classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61:3313800. |
[15] | DENG Peifang, XU Kejie, HUANG Hong. When CNNs meet vision transformer: a joint framework for remote sensing scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19:3109061. |
[16] | RAO Yongming, ZHAO Wenliang, TANG Yansong, et al. HorNet: efficient high-order spatial interactions with recursive gated convolutions[C]//Proceedings of 2022 Advances in Neural Information Processing Systems. [S.l.]: IEEE, 2022. |
[17] | DING Xiaohan, ZHANG Xiangyu, HAN Jungong, et al. Scaling up your kernels to 31x31: revisiting large kernel design in CNNs[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11963-11975. |
[18] | GUO Menghao, LU Chengze, LIU Zhengning, et al. Visual attention network[J]. Computational Visual Media, 2023, 9(4):733-752. |
[19] | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of 2018 European Conference on Computer Vision. Cham: Springer, 2018: 3-19. |
[20] | LIU Zhuang, MAO Hanzi, WU Chaoyuan, et al. A ConvNet for the 2020s[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11966-11976. |
[21] | XIA G S, YANG Wen, DELON J, et al. Structural high-resolution satellite image indexing[C]//Proceedings of 2010 ISPRS TC VII Symposium. [S.l.]: ISPRS, 2010: 298-303. |
[22] | ZHU Qiqi, ZHONG Yanfei, ZHAO Bei, et al. Bag-of-visual-words scene classifier with local and global features for high spatial resolution remote sensing imagery[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(6):747-751. |
[23] | ZOU Qin, NI Lihao, ZHANG Tong, et al. Deep learning based feature selection for remote sensing scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2015, 12(11):2321-2325. |
[24] | IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on International Conference on Machine Learning. Lille: ACM Press, 2015: 448-456. |
[25] | KINGMA D P, BA J. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference for Learning Representations. [S.l.]: IEEE, 2015: 1-13. |
[26] | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]//Proceedings of the 3rd International Conference for Learning Representations. [S.l.]: IEEE, 2015: 463-476. |
[27] | LIU Ze, LIN Yutong, CAO Yue, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 10012-10022. |
[28] | HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778. |
[29] | TAN M, LE Q. EfficientNetV2: smaller models and faster training[C]//Proceedings of 2021 International Conference on Machine Learning. [S.l.]: PMLR, 2021: 10096-10106. |
[30] | YU Weihao, LUO Mi, ZHOU Pan, et al. MetaFormer is actually what you need for vision[C]//Proceedings of 2022 IEEE/CVF conference on computer vision and pattern recognition. New Orleans: IEEE, 2022: 10819-10829. |
[31] | LI Siyuan, WANG Zedong, LIU Zicheng, et al. Efficient multi-order gated aggregation network[EB/OL]. [2022-11-07]. https://arxiv.org/abs/2211.03295. |
[32] | TANG Xu, LI Mingteng, MA Jingjing, et al. EMTCAL: efficient multiscale transformer and cross-level attention learning for remote sensing scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60:3194505. |
[33] | CHEN Weitao, OUYANG Shubing, TONG Wei, et al. GCSANet: a global context spatial attention deep learning network for remote sensing scene classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15:1150-1162. |
[34] | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 2020, 128(2):336-359. |