Acta Geodaetica et Cartographica Sinica ›› 2025, Vol. 54 ›› Issue (1): 154-164. doi: 10.11947/j.AGCS.2025.20240067

• Photogrammetry and Remote Sensing •

Self-supervised learning based urban functional zone classification by integrating optical remote sensing image-OSM data

Jialing LI1, Ji QI1,2,3, Weipeng LU4, Chao TAO1

  1. School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
    2. School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510006, China
    3. Huangpu Research School of Guangzhou University, Guangzhou 510000, China
    4. Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
  • Received: 2024-02-17 Revised: 2024-11-30 Published: 2025-02-17
  • Contact: Chao TAO E-mail: 235011028@csu.edu.cn; kingtaochao@126.com
  • About author: LI Jialing (2000—), female, postgraduate, majoring in deep-learning-based semantic segmentation of high-resolution remote sensing images and image-language-model-driven urban functional zone classification from remote sensing. E-mail: 235011028@csu.edu.cn
  • Supported by:
    The Natural Science Foundation of Hunan for Distinguished Young Scholars(2022JJ10072);The National Natural Science Foundation of China(42171376)

Abstract:

Rapid and accurate classification of urban functional zones (UFZs) provides a scientific basis for urban planning and management and supports sustainable urban development. Although optical remote sensing images provide rich visual information, they cannot fully reflect social attributes and are prone to semantic ambiguity. A growing number of studies therefore jointly use data that carry urban social attributes (e.g., OSM data) together with optical remote sensing images to obtain complementary information. This idea, however, faces two main challenges. First, optical images and OSM data differ in data structure, and traditional fusion methods provide insufficient interaction and fusion during feature extraction, so the model has difficulty fully learning the complementary advantages of the two data sources. Second, as the number of data modalities used for model learning grows, more manually labeled data are needed to train a stable model, which significantly increases the labor cost of applying a UFZ classification model. To address these problems, this paper proposes a self-supervised learning based urban functional zone classification method that integrates optical remote sensing images and OSM data. On the one hand, OSM data are unified with optical images in terms of spatial distribution and data structure, and the features of both modalities are then extracted and interactively fused in a unified multimodal fusion encoding architecture to learn cross-modal generalized representations. On the other hand, a self-supervised model is pre-trained on large-scale unlabeled data and then transferred to the specific UFZ classification task with a small amount of labeled data, thereby reducing the labor cost. UFZ classification experiments in three large-scale regions, Beijing, Los Angeles, and London, demonstrate the performance advantages of the proposed method over existing mainstream methods.
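A minimal sketch of the pipeline described above, written as an illustration only and not taken from the authors' released code: OSM data are assumed to be rasterized onto the optical image grid so both modalities share one spatial structure, a single encoder fuses them, self-supervised pre-training runs on unlabeled patches, and a small classification head is then fine-tuned with few labels. All names and choices below (MultimodalFusionEncoder, pretrain_step, the contrastive stand-in objective, channel counts) are assumptions for illustration.

    # Hypothetical sketch of the idea in the abstract; the real architecture,
    # pretext task, and hyperparameters are described in the paper itself.
    import torch
    import torch.nn as nn

    class MultimodalFusionEncoder(nn.Module):
        """Encode stacked optical + rasterized-OSM channels in one backbone."""

        def __init__(self, img_channels=3, osm_channels=1, embed_dim=128):
            super().__init__()
            # Early fusion: the two modalities are concatenated along the channel
            # axis once the OSM layer has been rasterized onto the image grid.
            self.backbone = nn.Sequential(
                nn.Conv2d(img_channels + osm_channels, 64, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(64, embed_dim, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
            )

        def forward(self, image, osm_raster):
            return self.backbone(torch.cat([image, osm_raster], dim=1))

    def pretrain_step(encoder, projector, image, osm_raster, optimizer):
        """One self-supervised step: contrastive agreement between two views of
        the same multimodal patch (a SimCLR-style stand-in pretext task)."""
        flipped = torch.flip(image, dims=[-1]), torch.flip(osm_raster, dims=[-1])
        z1 = nn.functional.normalize(projector(encoder(image, osm_raster)), dim=1)
        z2 = nn.functional.normalize(projector(encoder(*flipped)), dim=1)
        logits = z1 @ z2.t() / 0.1                    # temperature-scaled similarities
        labels = torch.arange(logits.size(0))         # positive pairs on the diagonal
        loss = nn.functional.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # Pre-train on unlabeled multimodal patches, then fine-tune a small head
    # on the few labeled UFZ samples.
    encoder = MultimodalFusionEncoder()
    projector = nn.Linear(128, 64)
    classifier = nn.Linear(128, 10)                   # e.g. 10 functional-zone classes
    opt = torch.optim.Adam(list(encoder.parameters()) + list(projector.parameters()), lr=1e-3)

    image = torch.randn(8, 3, 256, 256)               # optical patches
    osm = torch.randn(8, 1, 256, 256)                 # rasterized OSM layer
    pretrain_step(encoder, projector, image, osm, opt)

    labels = torch.randint(0, 10, (8,))
    finetune_loss = nn.functional.cross_entropy(classifier(encoder(image, osm)), labels)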

Key words: urban functional zone, optical remote sensing images, OSM, self-supervised learning, multimodality
