Acta Geodaetica et Cartographica Sinica ›› 2026, Vol. 55 ›› Issue (5): 927-940. doi: 10.11947/j.AGCS.2026.20250521

• Photogrammetry and Remote Sensing •

Heterogeneous remote sensing change detection based on vision-language collaborative representation for flood disasters

Rui YU1, Jie LI1,2, Huihui LIU1, Meiru WU1, Liupeng LIN3, Qiangqiang YUAN1, Li ZHENG1

  1. School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
  2. Hubei Luojia Laboratory, Wuhan 430079, China
  3. School of Resource and Environmental Sciences, Wuhan University, Wuhan 430079, China
  • Received: 2025-12-15  Revised: 2026-04-21  Online: 2026-06-23  Published: 2026-06-23
  • Contact: Huihui LIU. E-mail: rui.yu@whu.edu.cn; hhliu@sgg.whu.edu.cn
  • About the author: YU Rui (born 2002), male, master's student. His research interests are remote sensing change detection and deep learning-based multi-task learning. E-mail: rui.yu@whu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42471504; 42301417)

Abstract:

Heterogeneous change detection from optical and SAR imagery is important for disaster emergency response and all-weather monitoring. However, the two image types differ markedly in imaging mechanism, which leads to inconsistent feature distributions. Together with the scarcity of annotated samples and textual descriptions, this limits the detection performance of both traditional methods and existing deep learning models. To address this, this paper proposes a multi-dimensional change enhancement CLIP change detection network (MCE-CLIP) for heterogeneous image change detection in flood disaster scenarios. The network builds a cross-modal semantic guidance mechanism based on “SAR image transfer-text generation”, which narrows the semantic gap between heterogeneous images. It also includes a pseudo-siamese visual feature extraction branch and a multi-dimensional change feature enhancement module (MCFEM). Embedded modality adapters reduce the domain distribution discrepancy of remote sensing images, and the MCFEM combines temporal cross-attention, multi-granularity differencing, and hybrid similarity projection to integrate spatiotemporal contextual information efficiently. Experiments on two typical heterogeneous datasets show that MCE-CLIP outperforms existing mainstream heterogeneous change detection methods on core evaluation metrics such as F1 score and intersection over union (IoU).

Key words: change detection, heterogeneous remote sensing imagery, vision-language model, multi-modal fusion, SAR
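
The abstract reports results in terms of F1 score and IoU without defining them. As a minimal sketch, assuming the conventional definitions for the "changed" class of a binary change map (F1 = 2TP / (2TP + FP + FN), IoU = TP / (TP + FP + FN)), the two metrics could be computed as follows. The function name `change_detection_scores` and the toy label lists are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def change_detection_scores(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1 and IoU for the 'changed' class (label 1).

    Both inputs are flattened binary change maps: 1 = changed, 0 = unchanged.
    Standard metric definitions are assumed; the paper's exact evaluation
    protocol is not given in the abstract.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of pixels")

    # Count true positives, false positives and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    # Guard against empty denominators (e.g. no changed pixels at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    union = tp + fp + fn
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 0.0
    iou = tp / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy example (hypothetical data): 8 pixels, 3 TP, 1 FP, 1 FN.
pred = [1, 1, 0, 0, 1, 0, 0, 1]
truth = [1, 0, 0, 0, 1, 1, 0, 1]
scores = change_detection_scores(pred, truth)
print(scores)
```

Under these definitions IoU never exceeds F1 for the same prediction, because IoU = F1 / (2 − F1), so the two metrics rank methods consistently while IoU penalizes errors more heavily.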

CLC number: