测绘学报 (Acta Geodaetica et Cartographica Sinica), 2025, Vol. 54, Issue (1): 154-164. doi: 10.11947/j.AGCS.2025.20240067

• Photogrammetry and Remote Sensing •

面向城市功能区分类的光学遥感影像-OSM数据联合自监督学习方法

李佳铃1, 齐霁1,2,3, 鲁伟鹏4, 陶超1

    1.中南大学地球科学与信息物理学院,湖南 长沙 410083
    2.广州大学地理科学与遥感学院,广东 广州 510006
    3.广州大学黄埔研究院,广东 广州 510000
    4.香港理工大学土地测量及地理资讯学系,香港 999077
  • 收稿日期:2024-02-17 修回日期:2024-11-30 发布日期:2025-02-17
  • 通讯作者: 陶超 E-mail:235011028@csu.edu.cn;kingtaochao@126.com
  • 作者简介:李佳铃(2000—),女,硕士生,研究方向为高分辨率遥感影像智能解译,图像-语言模型驱动的遥感城市功能区分类。 E-mail:235011028@csu.edu.cn
  • 基金资助:
    湖南省杰出青年基金(2022JJ10072);国家自然科学基金(42171376)

Self-supervised learning based urban functional zone classification by integrating optical remote sensing image-OSM data

Jialing LI1, Ji QI1,2,3, Weipeng LU4, Chao TAO1

    1.School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
    2.School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510006, China
    3.Huangpu Research School of Guangzhou University, Guangzhou 510000, China
    4.Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
  • Received: 2024-02-17 Revised: 2024-11-30 Published: 2025-02-17
  • Contact: Chao TAO E-mail: 235011028@csu.edu.cn; kingtaochao@126.com
  • About author: LI Jialing (2000—), female, master's student. Her research interests include deep-learning-based semantic segmentation of high-resolution remote sensing images and image-language model-driven urban functional zone classification from remote sensing data. E-mail: 235011028@csu.edu.cn
  • Supported by:
    The Natural Science Foundation of Hunan for Distinguished Young Scholars (2022JJ10072); The National Natural Science Foundation of China (42171376)

摘要:

城市功能区的快速准确分类为城市规划和管理提供科学依据,有助于实现城市的可持续发展。尽管光学遥感影像提供了丰富的视觉信息,但其无法充分反映社会属性,易引发语义歧义。因此更多研究尝试联合使用包含城市社会属性的数据(如OSM数据)和光学遥感影像以期达到互补效果。但这一思路面临两个主要挑战:一是光学影像与OSM数据存在数据结构差异,传统的融合方法在特征提取阶段缺乏充分交互融合,导致模型难以充分学习数据之间的互补优势;二是随着模型学习使用的数据模态增多,训练一个稳定的模型需要更多的人工标注数据,但这显著提高了城市功能区分类模型应用的人力成本。针对上述问题,本文提出了一种面向城市功能区分类的光学遥感影像-OSM数据联合自监督学习方法。一方面,将OSM数据与光学影像在空间分布、数据结构等方面进行统一,然后在统一的多模态融合编码架构中进行特征提取和交互融合,以学习跨模态通用性表征。另一方面,采用自监督模型在大规模无标注数据上预训练,再通过少量标注数据将模型迁移到特定城市功能区分类任务中,从而减少人工成本。本文通过在北京、洛杉矶和伦敦3个大尺度区域进行城市功能区分类试验,证明了本文方法较现有主流方法的性能优势。

关键词: 城市功能区, 光学遥感影像, OSM, 自监督学习, 多模态

Abstract:

Rapid and accurate classification of urban functional zones (UFZs) provides a scientific basis for urban planning and management and supports sustainable urban development. Optical remote sensing images offer rich visual information, but they cannot fully reflect the social attributes of urban areas, which easily leads to semantic ambiguity. A growing number of studies therefore combine optical remote sensing images with data that carry urban social attributes, such as OpenStreetMap (OSM) data, so that the two sources complement each other. This approach faces two main challenges. First, optical images and OSM data differ in data structure, and traditional fusion methods provide too little interaction between the modalities during feature extraction, which makes it difficult for the model to fully learn their complementary strengths. Second, as more data modalities are used, training a stable model requires more manually labeled data, which significantly raises the labor cost of applying UFZ classification models. To address these problems, this paper proposes a joint self-supervised learning method for UFZ classification that integrates optical remote sensing images and OSM data. On the one hand, OSM data are unified with optical images in terms of spatial distribution and data structure, and features are then extracted and interactively fused within a unified multimodal fusion encoding architecture to learn general cross-modal representations. On the other hand, a self-supervised model is pre-trained on large-scale unlabeled data and then transferred to the specific UFZ classification task with a small amount of labeled data, which reduces the labor cost. UFZ classification experiments in three large-scale regions, Beijing, Los Angeles, and London, demonstrate the performance advantages of the proposed method over existing mainstream methods.

Key words: urban functional zone, optical remote sensing images, OSM, self-supervised learning, multimodality
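
The abstract states that OSM data are unified with optical images in spatial distribution and data structure, but it does not say how. One plausible reading is that OSM features are rasterized onto the image's pixel grid so that both modalities share the same array layout. The sketch below illustrates that idea only; the function name `rasterize_osm_points`, the point-only input format, and the per-category count channels are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical OSM feature format: (longitude, latitude, category tag).
OsmPoint = Tuple[float, float, str]


def rasterize_osm_points(
    points: List[OsmPoint],
    bbox: Tuple[float, float, float, float],  # (min_lon, min_lat, max_lon, max_lat)
    height: int,
    width: int,
    categories: List[str],
) -> List[List[List[int]]]:
    """Count OSM points per category in each image pixel cell.

    Returns a [num_categories][height][width] nested list whose grid is
    aligned with an image covering `bbox` (row 0 is the northern edge).
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    index: Dict[str, int] = {name: i for i, name in enumerate(categories)}
    grid = [[[0] * width for _ in range(height)] for _ in categories]

    for lon, lat, tag in points:
        if tag not in index:
            continue  # ignore tags outside the chosen category list
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue  # ignore features outside the image footprint
        # Map geographic coordinates to pixel indices.
        col = int((lon - min_lon) / (max_lon - min_lon) * width)
        row = int((max_lat - lat) / (max_lat - min_lat) * height)
        # Points exactly on the east or south edge fall into the last cell.
        col = min(col, width - 1)
        row = min(row, height - 1)
        grid[index[tag]][row][col] += 1
    return grid


# Toy example: a 2x2 grid over a 1-degree square (made-up coordinates).
demo_points = [
    (0.25, 0.75, "school"),  # north-west cell
    (0.75, 0.25, "shop"),    # south-east cell
    (0.80, 0.20, "shop"),    # south-east cell
    (2.00, 2.00, "shop"),    # outside the footprint, skipped
]
demo_grid = rasterize_osm_points(
    demo_points, (0.0, 0.0, 1.0, 1.0), 2, 2, ["school", "shop"]
)
```

Under this assumption, the resulting category channels have the same height and width as the image patch, so they could be stacked with or tokenized alongside the optical bands before entering a shared encoder. Real OSM data also contain lines and polygons, which would need a proper rasterizer rather than this point-counting shortcut.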

CLC Number: