Acta Geodaetica et Cartographica Sinica, 2025, Vol. 54, Issue (6): 1094-1106. doi: 10.11947/j.AGCS.2025.20240439
• Photogrammetry and Remote Sensing •
Zibo DONG1, Jingxue WANG2, Lijing BU2, Lin FANG3, Zhenghui XU1
Received: 2024-10-28
Revised: 2025-05-08
Online: 2025-07-14
Published: 2025-07-14
Contact: Jingxue WANG
E-mail: 472320795@stu.lntu.edu.cn; xiaoxue1861@163.com
About author: DONG Zibo (b. 2001), male, postgraduate student, majoring in remote sensing image information extraction. E-mail: 472320795@stu.lntu.edu.cn
Zibo DONG, Jingxue WANG, Lijing BU, Lin FANG, Zhenghui XU. MAFNet: building extraction method from remote sensing images based on multi-scale atrous fusion network[J]. Acta Geodaetica et Cartographica Sinica, 2025, 54(6): 1094-1106.
Tab. 1
Comparison of segmentation accuracy of different networks on the WHU building dataset
| Model | IoU (%) | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
|---|---|---|---|---|---|
| FCN-8s | 79.71 | 95.43 | 88.68 | 89.76 | 88.60 |
| U-Net | 77.47 | 95.36 | 88.36 | 87.46 | 87.90 |
| DeepLabV3+ | 86.07 | 96.48 | 92.28 | 90.05 | 91.15 |
| SegNet | 82.70 | 96.95 | 91.39 | 89.74 | 90.55 |
| BuildFormer | 85.96 | 97.03 | 89.51 | 89.45 | 89.47 |
| HD-Net | 84.59 | 96.41 | 89.03 | 88.89 | 88.96 |
| SDSC-UNet | 86.04 | 97.24 | 91.97 | 90.88 | 91.42 |
| MAFNet | 88.01 | 97.55 | 92.38 | 93.22 | 92.79 |
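
The page does not state how the five accuracy metrics were computed. As a minimal sketch, assuming the standard pixel-wise definitions for a binary building/background mask, they can be derived from the confusion counts as follows. The function names `confusion_counts` and `segmentation_metrics` are hypothetical, and the authors may have averaged per image or per class, so the published figures need not match these formulas exactly.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two flat sequences of 0/1 labels (1 = building)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    # Guard against empty denominators (e.g. no predicted building pixels).
    return num / den if den else 0.0


def segmentation_metrics(tp, fp, fn, tn):
    """Return IoU, accuracy, precision, recall and F1 as percentages."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    values = {
        "IoU": _ratio(tp, tp + fp + fn),
        "Accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "Precision": precision,
        "Recall": recall,
        "F1": _ratio(2 * precision * recall, precision + recall),
    }
    return {name: round(100.0 * v, 2) for name, v in values.items()}


if __name__ == "__main__":
    # Toy masks, flattened to one dimension; not data from the paper.
    pred = [1, 1, 0, 0, 1, 0]
    truth = [1, 0, 0, 1, 1, 0]
    print(segmentation_metrics(*confusion_counts(pred, truth)))
```

Under these definitions IoU penalises both false positives and false negatives in a single ratio, which is why it is usually the lowest of the five values in each row.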
Tab. 2
Comparison of segmentation accuracy of different networks on the Massachusetts building dataset
| Model | IoU (%) | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
|---|---|---|---|---|---|
| FCN-8s | 71.82 | 93.96 | 81.44 | 80.94 | 81.18 |
| U-Net | 73.43 | 94.12 | 82.35 | 83.66 | 82.99 |
| DeepLabV3+ | 76.67 | 94.74 | 85.87 | 86.67 | 86.26 |
| SegNet | 74.29 | 94.21 | 84.76 | 83.11 | 83.93 |
| BuildFormer | 77.31 | 95.01 | 87.26 | 86.32 | 86.79 |
| HD-Net | 76.44 | 94.49 | 85.20 | 85.66 | 85.43 |
| SDSC-UNet | 79.38 | 94.87 | 87.22 | 86.61 | 86.91 |
| MAFNet | 82.21 | 95.15 | 88.37 | 87.38 | 87.87 |
Tab. 3
Comparison of extraction efficiency
| Model | Training speed (batch/s) | Total training time (h) | Testing speed (batch/s) | Total testing time (min) | IoU (%) | F1 score (%) |
|---|---|---|---|---|---|---|
| FCN-8s | 3.0 | 30.1 | 6.2 | 3.2 | 79.71 | 88.60 |
| U-Net | 3.2 | 28.6 | 6.8 | 3.0 | 77.47 | 87.90 |
| DeepLabV3+ | 2.2 | 39.6 | 5.0 | 4.0 | 86.07 | 91.15 |
| SegNet | 2.8 | 32.2 | 6.0 | 3.4 | 82.70 | 90.55 |
| BuildFormer | 2.5 | 36.3 | 5.6 | 3.6 | 85.96 | 89.47 |
| HD-Net | 2.7 | 32.4 | 5.8 | 3.5 | 84.59 | 88.96 |
| SDSC-UNet | 2.4 | 37.8 | 5.1 | 3.9 | 86.04 | 91.42 |
| MAFNet | 2.7 | 32.4 | 5.9 | 3.4 | 88.01 | 92.79 |
Tab. 4
Influence of the residual structure and the MAF module on the accuracy metrics of the extraction results
| Model | IoU (%) | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
|---|---|---|---|---|---|
| U-Net | 77.47 | 95.36 | 88.36 | 87.46 | 87.90 |
| Res_UNet | 82.24 | 96.27 | 90.50 | 91.41 | 90.95 |
| Res_CBAM_UNet | 79.37 | 96.05 | 90.69 | 89.15 | 89.91 |
| Res_ECA_UNet | 85.09 | 97.15 | 92.47 | 91.34 | 92.07 |
| Res_SE_UNet | 83.41 | 95.71 | 91.40 | 90.51 | 90.94 |
| MAFNet | 88.01 | 97.55 | 92.38 | 93.22 | 92.79 |
Tab. 5
Influence of the mixed loss function on the accuracy metrics of the extraction results
| Loss function | IoU (%) | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
|---|---|---|---|---|---|
| LB | 86.58 | 97.05 | 92.14 | 89.67 | 91.03 |
| LD | 85.41 | 95.84 | 91.84 | 90.76 | 91.29 |
| LL | 86.18 | 96.27 | 92.01 | 93.26 | 92.63 |
| 0.8LB+0.1LD+0.1LL | 85.24 | 96.17 | 90.34 | 90.67 | 90.51 |
| 0.6LB+0.1LD+0.3LL | 86.96 | 97.34 | 90.91 | 93.31 | 92.09 |
| 0.6LB+0.2LD+0.2LL | 88.01 | 97.55 | 92.38 | 93.22 | 92.79 |
| 0.6LB+0.3LD+0.1LL | 85.94 | 96.21 | 90.66 | 92.29 | 91.47 |
| 0.4LB+0.3LD+0.3LL | 86.38 | 96.54 | 91.06 | 91.43 | 91.24 |
| 0.33LB+0.33LD+0.33LL | 85.99 | 96.42 | 91.13 | 90.95 | 91.03 |
| 0.2LB+0.6LD+0.2LL | 85.61 | 96.23 | 90.45 | 91.30 | 90.88 |
| 0.2LB+0.4LD+0.4LL | 85.72 | 96.67 | 90.87 | 91.56 | 91.21 |
| 0.2LB+0.2LD+0.6LL | 86.43 | 96.59 | 90.98 | 91.27 | 91.12 |
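
The rows of Tab. 5 are weighted sums of three loss terms. The page does not define LB, LD and LL; the sketch below assumes they denote binary cross-entropy, Dice and Lovász losses respectively, and uses the binary Lovász hinge as the third term, which may differ from the variant the authors used. All function names are hypothetical. The default weights (0.6, 0.2, 0.2) correspond to the best-scoring row of the table.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels (assumed meaning of LB)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss (assumed meaning of LD)."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def lovasz_hinge_loss(logits, targets):
    """Binary Lovasz hinge loss (assumed meaning of LL)."""
    # Hinge errors: large when a logit disagrees with its label.
    errors = [1.0 - z * (2 * t - 1) for z, t in zip(logits, targets)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    total_pos = sum(targets)
    cum_pos = cum_neg = 0
    prev_jaccard = 0.0
    loss = 0.0
    for i in order:
        cum_pos += targets[i]
        cum_neg += 1 - targets[i]
        # Jaccard loss if the pixels ranked so far were all misclassified.
        jaccard = 1.0 - (total_pos - cum_pos) / (total_pos + cum_neg)
        loss += max(errors[i], 0.0) * (jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return loss


def mixed_loss(logits, targets, weights=(0.6, 0.2, 0.2)):
    """Weighted sum w_B * LB + w_D * LD + w_L * LL over flat logits and 0/1 targets."""
    probs = [sigmoid(z) for z in logits]
    w_b, w_d, w_l = weights
    return (w_b * bce_loss(probs, targets)
            + w_d * dice_loss(probs, targets)
            + w_l * lovasz_hinge_loss(logits, targets))


if __name__ == "__main__":
    targets = [1, 0, 1, 0]  # toy labels, not data from the paper
    print(mixed_loss([3.0, -2.0, 0.5, 1.0], targets))
```

In a training pipeline these terms would normally be computed on tensors with automatic differentiation; the pure-Python version is only meant to make the weighting explicit.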