
Acta Geodaetica et Cartographica Sinica, 2025, Vol. 54, Issue 6: 1094-1106. doi: 10.11947/j.AGCS.2025.20240439
Zibo DONG1, Jingxue WANG2, Lijing BU2, Lin FANG3, Zhenghui XU1
Received: 2024-10-28
Revised: 2025-05-08
Published: 2025-07-14
Online: 2025-07-14
Corresponding author: Jingxue WANG
E-mail: 472320795@stu.lntu.edu.cn; xiaoxue1861@163.com
About the author: Zibo DONG (born 2001), male, master's student; main research interest: information extraction from remote sensing images. E-mail: 472320795@stu.lntu.edu.cn

Abstract:
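Building extraction from remote sensing images is important for disaster management, urban planning, and change monitoring. Urban buildings vary widely in size, and a single image often contains buildings of many different sizes, which limits extraction accuracy. To improve accuracy across building sizes, this paper proposes a building extraction method for remote sensing images based on a multi-scale atrous fusion network (MAFNet). The method builds on U-Net. First, residual structures are integrated into the encoder and decoder so that gradients propagate better during training. Second, a multi-scale atrous fusion module is placed at the bridge between the encoder and decoder; it uses several atrous convolutions to capture global contextual features and then strengthens the feature representation through channel and spatial attention mechanisms, which improves extraction accuracy for buildings of different sizes. Finally, a hybrid loss function is designed to improve overall boundary extraction. Experiments were conducted on the WHU building and Massachusetts building datasets, and the proposed method was compared with current mainstream semantic segmentation networks. The results show that the proposed method achieves the highest IoU among the compared networks on both datasets (88.01% on WHU building and 82.21% on Massachusetts building), adapts to buildings of various sizes, and extracts building boundaries that are more complete and smoother.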
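
To illustrate why combining several atrous (dilated) convolutions can help with buildings of different sizes, the following minimal Python sketch computes the effective kernel size of a dilated convolution and applies a 1-D dilated convolution to a toy signal. The kernel size 3, the dilation rates (1, 2, 4, 8), and the function names are illustrative assumptions; the abstract does not state the rates MAFNet uses, and this is not the authors' implementation.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input covered by one dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated convolution (cross-correlation form) without padding.

    Each output value sums kernel[j] * signal[i + j * dilation], so the
    kernel taps are spread `dilation` samples apart.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(out_len)
    ]


# Hypothetical rates for parallel branches (assumption, not from the paper).
ASSUMED_RATES = (1, 2, 4, 8)
spans = {d: effective_kernel_size(3, d) for d in ASSUMED_RATES}
print(spans)  # {1: 3, 2: 5, 4: 9, 8: 17}

impulse = [0, 0, 0, 1, 0, 0, 0]
print(dilated_conv1d(impulse, [1, 1, 1], dilation=1))  # [0, 1, 1, 1, 0]
print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))  # [0, 1, 0]
```

With the same three weights, a larger rate covers a wider span of the input, so running several rates in parallel lets one module respond to both small and large structures.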
Zibo DONG, Jingxue WANG, Lijing BU, Lin FANG, Zhenghui XU. MAFNet: building extraction method from remote sensing images based on multi-scale atrous fusion network[J]. Acta Geodaetica et Cartographica Sinica, 2025, 54(6): 1094-1106.

Table 1. Comparison of extraction accuracy of different networks on the WHU building dataset (%)

| Model | IoU | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|
| FCN-8s | 79.71 | 95.43 | 88.68 | 89.76 | 88.60 |
| U-Net | 77.47 | 95.36 | 88.36 | 87.46 | 87.90 |
| DeepLabV3+ | 86.07 | 96.48 | 92.28 | 90.05 | 91.15 |
| SegNet | 82.70 | 96.95 | 91.39 | 89.74 | 90.55 |
| BuildFormer | 85.96 | 97.03 | 89.51 | 89.45 | 89.47 |
| HD-Net | 84.59 | 96.41 | 89.03 | 88.89 | 88.96 |
| SDSC-UNet | 86.04 | 97.24 | 91.97 | 90.88 | 91.42 |
| MAFNet | 88.01 | 97.55 | 92.38 | 93.22 | 92.79 |
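
The five metrics reported in these tables can be sketched from pixel-wise confusion counts. The code below assumes the standard binary-segmentation definitions with "building" as the positive class; this page does not give the formulas, so the definitions and the function names are assumptions rather than the authors' evaluation code.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat binary masks (1 = building, 0 = background)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def building_metrics(tp, fp, fn, tn):
    """Return IoU, accuracy, precision, recall and F1 as percentages."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "IoU": 100.0 * _ratio(tp, tp + fp + fn),
        "Accuracy": 100.0 * _ratio(tp + tn, tp + fp + fn + tn),
        "Precision": 100.0 * precision,
        "Recall": 100.0 * recall,
        "F1": 100.0 * _ratio(2.0 * precision * recall, precision + recall),
    }


# Toy example: 100 pixels, 10 of them building, 8 found, 2 false alarms.
scores = building_metrics(tp=8, fp=2, fn=2, tn=88)
print({name: round(value, 2) for name, value in scores.items()})
```

Because true negatives enter only the accuracy, imagery dominated by background can show high accuracy while IoU stays much lower, which is one reason IoU and F1 are usually read first in tables like these.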

Table 2. Comparison of extraction accuracy of different networks on the Massachusetts building dataset (%)

| Model | IoU | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|
| FCN-8s | 71.82 | 93.96 | 81.44 | 80.94 | 81.18 |
| U-Net | 73.43 | 94.12 | 82.35 | 83.66 | 82.99 |
| DeepLabV3+ | 76.67 | 94.74 | 85.87 | 86.67 | 86.26 |
| SegNet | 74.29 | 94.21 | 84.76 | 83.11 | 83.93 |
| BuildFormer | 77.31 | 95.01 | 87.26 | 86.32 | 86.79 |
| HD-Net | 76.44 | 94.49 | 85.20 | 85.66 | 85.43 |
| SDSC-UNet | 79.38 | 94.87 | 87.22 | 86.61 | 86.91 |
| MAFNet | 82.21 | 95.15 | 88.37 | 87.38 | 87.87 |

Table 3. Comparison of extraction efficiency

| Model | Training speed/(batch/s) | Total training time/h | Test speed/(batch/s) | Total test time/min | IoU/(%) | F1 score/(%) |
|---|---|---|---|---|---|---|
| FCN-8s | 3.0 | 30.1 | 6.2 | 3.2 | 79.71 | 88.60 |
| U-Net | 3.2 | 28.6 | 6.8 | 3.0 | 77.47 | 87.90 |
| DeepLabV3+ | 2.2 | 39.6 | 5.0 | 4.0 | 86.07 | 91.15 |
| SegNet | 2.8 | 32.2 | 6.0 | 3.4 | 82.70 | 90.55 |
| BuildFormer | 2.5 | 36.3 | 5.6 | 3.6 | 85.96 | 89.47 |
| HD-Net | 2.7 | 32.4 | 5.8 | 3.5 | 84.59 | 88.96 |
| SDSC-UNet | 2.4 | 37.8 | 5.1 | 3.9 | 86.04 | 91.42 |
| MAFNet | 2.7 | 32.4 | 5.9 | 3.4 | 88.01 | 92.79 |

Table 5. Effect of the hybrid loss function on extraction accuracy metrics (%)

| Loss function | IoU | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|---|
| LB | 86.58 | 97.05 | 92.14 | 89.67 | 91.03 |
| LD | 85.41 | 95.84 | 91.84 | 90.76 | 91.29 |
| LL | 86.18 | 96.27 | 92.01 | 93.26 | 92.63 |
| 0.8LB+0.1LD+0.1LL | 85.24 | 96.17 | 90.34 | 90.67 | 90.51 |
| 0.6LB+0.1LD+0.3LL | 86.96 | 97.34 | 90.91 | 93.31 | 92.09 |
| 0.6LB+0.2LD+0.2LL | 88.01 | 97.55 | 92.38 | 93.22 | 92.79 |
| 0.6LB+0.3LD+0.1LL | 85.94 | 96.21 | 90.66 | 92.29 | 91.47 |
| 0.4LB+0.3LD+0.3LL | 86.38 | 96.54 | 91.06 | 91.43 | 91.24 |
| 0.33LB+0.33LD+0.33LL | 85.99 | 96.42 | 91.13 | 90.95 | 91.03 |
| 0.2LB+0.6LD+0.2LL | 85.61 | 96.23 | 90.45 | 91.30 | 90.88 |
| 0.2LB+0.4LD+0.4LL | 85.72 | 96.67 | 90.87 | 91.56 | 91.21 |
| 0.2LB+0.2LD+0.6LL | 86.43 | 96.59 | 90.98 | 91.27 | 91.12 |
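
The weighted combinations in Table 5 can be sketched as a plain weighted sum of three loss terms. The code below assumes that LB, LD and LL denote binary cross-entropy, Dice and Lovász (hinge) losses; this page does not define the symbols, so that reading, the function names, and the Dice smoothing constant are all assumptions, and the snippet is not the authors' implementation. The default weights are the best-scoring row of Table 5 (0.6, 0.2, 0.2).

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_loss(logits, labels, eps=1e-7):
    """Mean binary cross-entropy over flat lists of logits and 0/1 labels."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def dice_loss(logits, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + s) / (sum(p) + sum(y) + s)."""
    probs = [sigmoid(z) for z in logits]
    overlap = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * overlap + smooth) / (sum(probs) + sum(labels) + smooth)


def lovasz_hinge_loss(logits, labels):
    """Binary Lovász hinge: hinge errors weighted by Jaccard-loss increments."""
    errors = [1.0 - z * (2 * y - 1) for z, y in zip(logits, labels)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    foreground = sum(labels)
    inter, union, prev_jaccard, loss = foreground, foreground, 0.0, 0.0
    for i in order:  # walk pixels from the largest error to the smallest
        inter -= labels[i]
        union += 1 - labels[i]
        jaccard = 1.0 - inter / union
        loss += max(errors[i], 0.0) * (jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return loss


def hybrid_loss(logits, labels, weights=(0.6, 0.2, 0.2)):
    """Weighted sum w_B * L_B + w_D * L_D + w_L * L_L."""
    w_b, w_d, w_l = weights
    return (w_b * bce_loss(logits, labels)
            + w_d * dice_loss(logits, labels)
            + w_l * lovasz_hinge_loss(logits, labels))


labels = [1, 1, 0, 0]
print(hybrid_loss([10.0, 10.0, -10.0, -10.0], labels))  # confident, correct
print(hybrid_loss([-10.0, -10.0, 10.0, 10.0], labels))  # confident, wrong
```

Passing `weights=(1.0, 0.0, 0.0)` reduces the function to the single-loss LB row under the same assumption, which makes it easy to reproduce the structure of the ablation with other weightings.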