Acta Geodaetica et Cartographica Sinica (测绘学报) ›› 2026, Vol. 55 ›› Issue (2): 328-343. doi: 10.11947/j.AGCS.2026.20250331

• Photogrammetry and Remote Sensing •

Heterogeneous remote sensing image flood change detection based on multi-scale cross-modal feature fusion

PENG Daifeng1,2, LIU Xuelian1, LU Mengfei1, GUAN Haiyan1

  1. School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. Technology Innovation Center for Integrated Applications in Remote Sensing and Navigation, Ministry of Natural Resources, Nanjing 210044, China
  • Received: 2025-09-04  Revised: 2026-01-16  Published: 2026-03-13
  • About author: PENG Daifeng (1988—), male, PhD, associate professor; research interests: intelligent interpretation of remote sensing imagery. E-mail: daifeng@nuist.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42371449; 41801386)

Heterogeneous remote sensing image flood change detection based on multi-scale cross-modal feature fusion

Daifeng PENG1,2, Xuelian LIU1, Mengfei LU1, Haiyan GUAN1

  1. School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. Technology Innovation Center for Integrated Applications in Remote Sensing and Navigation, Ministry of Natural Resources, Nanjing 210044, China
  • Received: 2025-09-04  Revised: 2026-01-16  Published: 2026-03-13
  • About author: PENG Daifeng (1988—), male, PhD, associate professor; research interests: intelligent interpretation of remote sensing imagery. E-mail: daifeng@nuist.edu.cn
  • Supported by:
The National Natural Science Foundation of China (42371449; 41801386)

Abstract:

To address the limitations of existing end-to-end heterogeneous change detection methods, which fail to properly account for inter-modal feature differences and struggle to balance local detail with global semantic information, this paper proposes a multi-scale cross-modal feature fusion network (MHCDNet) for heterogeneous remote sensing image change detection. First, built on an encoder-decoder architecture, the encoder employs a remote sensing foundation model to construct multi-scale feature representations of multi-modal images; to strengthen the textural and structural information of these multi-scale features, a feature enhancement module is introduced, whose bottleneck-structured multi-scale convolutions effectively enhance the detail information of each modality's features while suppressing noise interference. Second, to account for inter-modal feature differences and fuse shallow heterogeneous features efficiently, a selective cross-modal fusion module is introduced, which learns dynamic weights to fuse multi-modal features adaptively, effectively capturing complementary inter-modal information and improving the robustness and expressiveness of the fused features. Third, to model the spatiotemporal context of deep heterogeneous features, a cross-modal cross-attention fusion module is introduced, which uses spatial and channel attention mechanisms to capture the spatiotemporal correlations between modal features, markedly enhancing the robustness and reliability of the fused features. Finally, an adaptive up-sampling module is proposed to align and fuse encoder-decoder features, compensating for the detail information lost during decoding and accumulating change information; a change head composed of three convolutional layers and up-sampling modules then generates the change map.

To verify the effectiveness of the proposed method, experiments are conducted on two large-scale flood change detection datasets, CAU-Flood and Ombria. The results show that, compared with conventional methods, the proposed method achieves the best accuracy metrics on both datasets while markedly reducing missed and false detections, yielding the best visual results. Ablation studies further verify the effectiveness of each MHCDNet module, and a model complexity analysis shows that MHCDNet has low computational complexity, striking the best balance between accuracy and efficiency.
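As a rough illustration of the bottleneck multi-scale convolution idea behind the feature enhancement module, the toy NumPy sketch below enhances a single-channel feature map with parallel 3×3 and 5×5 branches plus a residual connection. The kernels, the scalar stand-in for a 1×1 bottleneck convolution, and the residual add are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2-D convolution of a single-channel map with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

def feature_enhance(x, bottleneck_scale, k3, k5):
    """FEM-style idea: a bottleneck scaling (stand-in for a 1x1 conv),
    parallel multi-scale branches, and a residual add that keeps detail."""
    b = bottleneck_scale * x
    branches = conv2d(b, k3) + conv2d(b, k5)   # multi-scale context
    return x + branches                        # residual enhancement

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
y = feature_enhance(x, 0.5, np.full((3, 3), 1 / 9), np.full((5, 5), 1 / 25))
assert y.shape == x.shape
```

With zero-valued branch kernels the sketch reduces to the identity, which is one quick way to check the residual wiring.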

Key words: heterogeneous change detection, feature enhancement, selective cross-modal fusion, cross-modal cross-attention fusion, adaptive up-sampling

Abstract:

To address the limitations in existing end-to-end heterogeneous change detection methods, which often neglect modality-specific feature differences and struggle to balance local details with global semantics, this paper introduces a multi-scale heterogeneous change detection network (MHCDNet) featuring cross-modal fusion for heterogeneous remote sensing imagery, which is built upon an encoder-decoder architecture. In the encoding part, a remote sensing foundation model is utilized to construct multi-scale feature representations for multi-modal images. To enhance the textural and structural information, a feature enhancement module (FEM) is introduced, which employs a bottleneck structure with multi-scale convolution design to effectively enhance detail information in different modal features while suppressing noise interference. Furthermore, to effectively account for the differences in multimodal features and achieve efficient fusion of shallow heterogeneous features, a selective cross-modal fusion module (SCFM) is introduced, which learns dynamic weights to enable adaptive fusion of multi-modal features, effectively capturing complementary information between modalities, thereby enhancing the robustness and representational capacity of fused features. Additionally, to effectively model the spatiotemporal context of deep heterogeneous features, a cross-modal cross-attention fusion module (CCFM) is introduced, which leverages both spatial and channel attention mechanisms to capture inter-modal spatiotemporal correlations, significantly enhancing the robustness and reliability of fused features. Finally, an adaptive up-sampling module (AUM) is proposed to achieve alignment and fusion of encoder-decoder features, effectively compensating for the loss of detail information during the decoding process, accumulating the change information, and generating change maps through a change head composed of three convolutional layers and up-sampling modules. 
To verify the effectiveness of the proposed method, experiments are conducted on two large-scale flood change detection datasets, CAU-Flood and Ombria. The results demonstrate that, compared with other methods, MHCDNet achieves the best accuracy metrics on both datasets while significantly reducing false alarms and missed detections, yielding the best visual results. Ablation studies further verify the effectiveness of each module in MHCDNet, and model complexity analysis shows that MHCDNet has low computational complexity, achieving the best balance between accuracy and efficiency.
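To make the two fusion ideas in the abstract concrete, the NumPy sketch below shows (a) gated fusion with dynamic, softmax-normalized weights in the spirit of the selective cross-modal fusion module, and (b) a single-head spatial cross-attention between modalities in the spirit of the cross-modal cross-attention fusion module. The tensor shapes, the pooling-based gate, and the weight matrix `w` are illustrative assumptions, not the published architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def selective_fusion(fa, fb, w):
    """SCFM-style idea: pool each modality to a channel descriptor,
    score it with a learned matrix, softmax across the two modalities,
    and take the weighted sum of the feature maps."""
    g = np.stack([fa.mean(axis=(1, 2)), fb.mean(axis=(1, 2))])  # (2, C)
    alpha = softmax(g @ w, axis=0)                              # dynamic weights
    return alpha[0][:, None, None] * fa + alpha[1][:, None, None] * fb

def cross_attention(fa, fb):
    """CCFM-style idea (single head): queries from one modality,
    keys/values from the other, attention over spatial positions."""
    c, h, wd = fa.shape
    q = fa.reshape(c, -1).T                        # (HW, C)
    k = fb.reshape(c, -1).T                        # (HW, C)
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (HW, HW), rows sum to 1
    return (attn @ k).T.reshape(c, h, wd)

rng = np.random.default_rng(0)
f_opt = rng.standard_normal((8, 4, 4))   # e.g. optical features, (C, H, W)
f_sar = rng.standard_normal((8, 4, 4))   # e.g. SAR features
w = 0.1 * rng.standard_normal((8, 8))
fused = selective_fusion(f_opt, f_sar, w)
attended = cross_attention(f_opt, f_sar)
assert fused.shape == attended.shape == (8, 4, 4)
```

In the real network these operations would act on learned deep features with multi-head attention; the sketch only demonstrates the gating and attention mechanics.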

Key words: heterogeneous change detection, feature enhancement, selective cross-modal fusion, cross-modal cross-attention fusion, adaptive up-sampling

CLC number: