Hierarchical feature and diversified attention fusion network for collaborative extraction of road surface and centerline

doi:10.11947/j.AGCS.2026.20250446

Abstract

Deep learning has become the dominant approach for automatic road network extraction based on spatio-temporal data. However, due to significant variations in road scale and frequent occlusions, existing methods often suffer from road discontinuities, missing detections, and jagged boundaries. To address these challenges, this paper proposes a hierarchical feature-aware and diversified-attention-based collaborative road surface and centerline extraction network (HFDA-Net). The proposed network takes single-source imagery or multi-source data as input and adopts a dual-branch collaborative modeling strategy for road network extraction. First, a hierarchical feature interaction and fusion module (HFIFM) is designed to couple convolutional neural networks with Transformer architectures, enabling effective fusion of local details and global semantic information across multiple feature levels. Second, to enhance the perception of linear road structures and improve feature discriminability, a state-space global scanning enhancement module (SGSEM) and a diversified attention refinement module (DARM) are introduced. Finally, a dual-branch decoder based on a graph transformer (DDGT) is constructed to explicitly model the spatial-structural co-existence between road surfaces and centerlines, achieving complementary information exchange and collaborative prediction during decoding, thereby improving the completeness of road network extraction. Experimental results on the BJRoad, Massachusetts, and City-scale datasets demonstrate that the proposed method outperforms state-of-the-art approaches in key metrics such as IoU, F₁-score, and TOPO, effectively alleviating road discontinuity and missing detection issues. The proposed method provides robust technical support for large-scale road network updating and intelligent driving applications.

Key words: road network extraction, heterogeneous network collaboration, diversified attention, hierarchical features, semantic segmentation
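The abstract reports IoU and F₁-score without defining them here. The sketch below computes the usual pixel-wise precision, recall, F₁, and IoU for a pair of binary road masks. It assumes the standard textbook definitions; the function name `binary_road_metrics` and the toy masks are hypothetical, and the paper's exact evaluation protocol (for example per-image versus dataset-level averaging, or any tolerance buffer) is not stated in this excerpt. The graph-based metrics (TOPO, APLS) are not covered.

```python
def binary_road_metrics(pred, truth):
    """Pixel-wise precision, recall, F1 and IoU for two binary masks (1 = road).

    Standard definitions only; not necessarily the paper's exact protocol.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # road predicted, road in ground truth
            elif p and not t:
                fp += 1          # road predicted, background in ground truth
            elif t and not p:
                fn += 1          # road missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 3x4 masks (hypothetical data): a vertical two-pixel-wide road.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],   # one missed road pixel (false negative)
        [0, 1, 1, 1]]   # one spurious road pixel (false positive)

metrics = binary_road_metrics(pred, truth)
```

In this toy case there are 5 true positives, 1 false positive, and 1 false negative, so precision, recall, and F₁ are all 5/6 while IoU is 5/7. IoU penalizes the same errors more heavily than F₁, which is consistent with the IoU columns in the result tables being lower than the F₁ columns.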

CLC Number: P208

Zejiao WANG, Longgang XIANG, Meng WANG, Xingjuan WANG, Qing LIU. Hierarchical feature and diversified attention fusion network for collaborative extraction of road surface and centerline[J]. Acta Geodaetica et Cartographica Sinica, 2026, 55(3): 548-563.


URL: http://xb.chinasmp.com/EN/10.11947/j.AGCS.2026.20250446

http://xb.chinasmp.com/EN/Y2026/V55/I3/548

Figures/Tables: 21 (Fig. 1 to Fig. 15, Tab. 1 to Tab. 6)

Method	Precision (%)	Recall (%)	F₁ (%)	IoU (%)	Multi-task	Params (MB)	FLOPs (G)	GPU memory (GB)	Training time (s/epoch)
ResUnet	84.57	90.55	87.36	77.65	N	26.64	1 224.62	10.17	114
SegFormer	85.86	88.76	87.17	77.36	N	13.68	53.37	4.63	45
RoadCon	86.90	90.96	88.79	79.94	N	29.01	665.82	7.40	84
D-LinkNet	87.13	91.45	89.15	80.53	N	31.22	144.79	2.96	30
CMMPNet	87.99	89.24	88.49	79.43	N	84.99	317.03	7.28	84
OARENet	86.79	90.25	88.40	79.30	N	65.98	401.74	9.45	89
FRCFNet	84.64	90.22	87.23	77.44	N	7.89	151.87	8.21	173
MSMDFFNet	85.79	90.30	87.91	78.51	N	39.27	599.30	8.33	96
MG-RoadNet	88.08	89.61	88.77	79.85	N	70.95	4 443.10	36.68	21 654
SA-MixNet	86.05	91.81	88.75	79.87	N	33.02	1 488.03	5.42	93
HFDA-Net (M1)	87.17	91.14	88.96	80.29	Y	108.95	269.98	5.51	25
HFDA-Net (M4)	88.34	90.59	89.29	80.82	Y	174.64	755.08	7.50	36

Method	F₁ (%)	Precision (%)	Recall (%)	APLS (%)	Multi-task	Params (MB)	FLOPs (G)	GPU memory (GB)	Training time (s/epoch)
RNGDet++	76.98	92.57	65.88	67.36	N	41.28	809.01	9.28	29
SAM-Road	81.58	93.30	72.48	72.64	N	87.06	209.154	5.04	21
SAM-Road++	77.52	93.04	66.43	69.94	N	87.08	209.24	4.98	46
HFDA-Net	84.22	95.48	75.33	79.46	Y	174.64	755.08	7.50	36

Method	Precision	Recall	F₁	IoU
D-LinkNet	79.75	76.32	77.76	63.95
U-Net	80.67	76.68	78.40	64.80
CMMPNet	79.94	76.66	78.07	64.37
RoadCon	80.84	76.37	78.32	64.69
OARENet	78.73	76.02	77.09	63.11
FRCFNet	79.67	77.64	78.44	64.86
MSMDFFNet	79.76	77.66	78.51	64.94
MG-RoadNet	83.37	72.33	77.46	63.21
SA-MixNet	73.66	80.13	76.77	62.28
HFDA-Net	81.19	77.03	78.81	65.34

Method	F₁	Precision	Recall	APLS
Seg-Improved	72.20	75.83	68.90	55.34
Seg-DLA	73.89	75.59	72.26	57.22
Sat2Graph	76.26	80.70	72.28	63.14
TD-Road	76.43	81.94	71.63	65.74
RNGDet++	78.44	85.65	72.58	67.76
SAM-Road	77.23	90.47	67.69	68.37
SAM-Road++	80.66	89.08	74.07	69.55
HFDA-Net	80.93	87.12	75.56	68.50

Model	HFIFM	SGSEM	DARM	EWA	Centerline F₁	Centerline Precision	Centerline Recall	Centerline APLS	Surface IoU	Surface F₁	Surface Precision	Surface Recall
M0	-	-	-	-	79.34	92.01	69.74	74.17	79.33	88.36	85.20	92.18
M1	-	-	-	√	81.01	92.98	71.78	76.41	80.29	88.96	87.17	91.14
M2	√	-	-	√	82.53	93.95	73.58	77.69	80.33	88.98	87.44	90.92
M3	√	√	-	√	83.61	94.11	75.22	78.28	79.76	88.61	87.80	89.84
M4	√	√	√	√	84.22	95.48	75.33	79.46	80.82	89.29	88.34	90.59
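
The ablation rows above are easier to compare as gains over the M0 baseline. The following sketch does that arithmetic for four of the reported columns. The numbers are copied from the table; the dictionary keys and the helper name `gains_over_baseline` are hypothetical labels introduced only for this illustration.

```python
# Values copied from the ablation table (percent). Keys are illustrative labels:
# centerline F1, centerline APLS, road surface IoU, road surface F1.
ABLATION = {
    "M0": {"cl_f1": 79.34, "cl_apls": 74.17, "surf_iou": 79.33, "surf_f1": 88.36},
    "M1": {"cl_f1": 81.01, "cl_apls": 76.41, "surf_iou": 80.29, "surf_f1": 88.96},
    "M2": {"cl_f1": 82.53, "cl_apls": 77.69, "surf_iou": 80.33, "surf_f1": 88.98},
    "M3": {"cl_f1": 83.61, "cl_apls": 78.28, "surf_iou": 79.76, "surf_f1": 88.61},
    "M4": {"cl_f1": 84.22, "cl_apls": 79.46, "surf_iou": 80.82, "surf_f1": 89.29},
}


def gains_over_baseline(table, baseline="M0"):
    """Return, per model, the difference to the baseline in percentage points."""
    base = table[baseline]
    return {
        model: {metric: round(value - base[metric], 2)
                for metric, value in row.items()}
        for model, row in table.items()
        if model != baseline
    }


gains = gains_over_baseline(ABLATION)
for model, row in gains.items():
    print(model, row)
```

Read this way, the full model M4 improves on M0 by 4.88 points in centerline F₁, 5.29 points in APLS, and 1.49 points in road surface IoU. The centerline metrics rise at every step, whereas the road surface IoU of M3 (79.76) falls below that of M2 (80.33) before recovering in M4.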