Acta Geodaetica et Cartographica Sinica ›› 2025, Vol. 54 ›› Issue (6): 1094-1106. doi: 10.11947/j.AGCS.2025.20240439

• Photogrammetry and Remote Sensing •

MAFNet: building extraction method from remote sensing images based on multi-scale atrous fusion network

Zibo DONG1, Jingxue WANG2, Lijing BU2, Lin FANG3, Zhenghui XU1

  1. School of Geomatics, Liaoning Technical University, Fuxin 123000, China
  2. School of Automation and Electronic Information, Xiangtan University, Xiangtan 411105, China
  3. Hexagon Geosystems (Qingdao) Co., Ltd., Qingdao 266114, China
  • Received: 2024-10-28 Revised: 2025-05-08 Online: 2025-07-14 Published: 2025-07-14
  • Contact: Jingxue WANG E-mail: 472320795@stu.lntu.edu.cn; xiaoxue1861@163.com
  • About author: DONG Zibo (born 2001), male, postgraduate student, majoring in information extraction from remote sensing images. E-mail: 472320795@stu.lntu.edu.cn
  • Supported by:
    Natural Science Foundation of Hunan Province (2022JJ30561); Fundamental Applied Research Foundation of Liaoning Province (2022JH2/101300273)

Abstract:

Building extraction from remote sensing images is important for disaster management, urban planning, and change monitoring. Because urban buildings vary widely in size, a single remote sensing image contains buildings at multiple spatial scales, which limits extraction accuracy. To improve accuracy for buildings of different scales, this paper proposes a building extraction method for remote sensing images based on a multi-scale atrous fusion network (MAFNet). Building on U-Net, residual structures are first incorporated into the encoder and decoder to improve gradient propagation during training. A multi-scale atrous fusion (MAF) module is then introduced at the bridge between the encoder and decoder. This module uses multiple atrous convolutions to capture global context features and further enhances feature representation through channel and spatial attention mechanisms, improving extraction accuracy for buildings of different scales. Finally, a hybrid loss function is designed to improve boundary extraction. Experiments are conducted on the WHU building dataset and the Massachusetts building dataset, and the proposed method is compared with current mainstream semantic segmentation networks. The results show that the proposed method significantly improves building extraction accuracy, adapts to buildings of various sizes, and extracts building boundaries more completely and smoothly.
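
The idea behind using several atrous convolutions can be illustrated with a minimal one-dimensional sketch. It is not the authors' MAF module: the function names, the dilation rates, and fusion by simple averaging are all assumptions made for illustration, and the channel and spatial attention steps are omitted. The sketch only shows that a fixed 3-tap kernel covers a wider span of the input as the dilation rate grows, so parallel branches with different rates respond to structures of different sizes.

```python
# Illustrative sketch only; names, rates, and averaging fusion are assumptions,
# not the architecture described in the paper.

def effective_kernel_size(k, rate):
    """Span of input covered by a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1-D atrous (dilated) convolution.

    As in most deep learning libraries, the kernel is not flipped
    (this is technically cross-correlation).
    """
    k = len(kernel)
    span = (k - 1) * rate            # distance between first and last tap
    pad = span // 2                  # left padding; the rest goes on the right
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * rate] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_fusion(signal, kernel, rates):
    """Run one branch per dilation rate and average the branch outputs.

    Averaging is a placeholder for whatever fusion the real module uses.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) / len(vals) for vals in zip(*branches)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    w = [1, 1, 1]
    for r in (1, 2, 4):              # hypothetical dilation rates
        print(r, effective_kernel_size(len(w), r), atrous_conv1d(x, w, r))
    print(multi_scale_fusion(x, w, (1, 2)))
```

With a 3-tap kernel, rates 1, 2, and 4 cover 3, 5, and 9 input positions respectively without adding any weights, which is why atrous branches are a common way to gather context at several scales.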

Key words: remote sensing images, building extraction, U-Net, multi-scale atrous fusion, hybrid loss function
