Acta Geodaetica et Cartographica Sinica ›› 2026, Vol. 55 ›› Issue (3): 548-563. doi: 10.11947/j.AGCS.2026.20250446

• Cartography and Geographic Information •

Hierarchical feature and diversified attention fusion network for collaborative extraction of road surface and centerline

Zejiao WANG1, Longgang XIANG1, Meng WANG1, Xingjuan WANG1, Qing LIU2

  1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  2. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China
  • Received: 2025-10-23 Revised: 2026-03-18 Online: 2026-04-16 Published: 2026-04-16
  • Contact: Longgang XIANG E-mail: zjwang@whu.edu.cn; geoxlg@whu.edu.cn
  • About author:WANG Zejiao (1993—), male, PhD candidate, majors in spatio-temporal data analysis and applications, and artificial intelligence. E-mail: zjwang@whu.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (42471460; 42071432)

Abstract:

Deep learning has become the dominant approach to automatic road network extraction from spatio-temporal data. However, because road scales vary widely and occlusions are frequent, existing methods often produce discontinuous roads, missed detections, and jagged boundaries. To address these problems, this paper proposes a hierarchical feature-aware, diversified-attention-based network for collaborative road surface and centerline extraction (HFDA-Net). The network takes single-source imagery or multi-source data as input and adopts a dual-branch collaborative modeling strategy. First, a hierarchical feature interaction and fusion module (HFIFM) couples convolutional neural networks with Transformer architectures to fuse local details with global semantic information across multiple feature levels. Second, a state-space global scanning enhancement module (SGSEM) and a diversified attention refinement module (DARM) are introduced to strengthen the perception of linear road structures and improve feature discriminability. Finally, a dual-branch decoder based on a graph transformer (DDGT) explicitly models the spatial-structural co-existence of road surfaces and centerlines, so that the two branches exchange complementary information and predict collaboratively during decoding, which improves the completeness of the extracted road network. Experiments on the BJRoad, Massachusetts, and City-scale datasets show that the proposed method outperforms state-of-the-art approaches on key metrics such as IoU, F1-score, and TOPO, and that it alleviates road discontinuity and missed detections. The method offers robust technical support for large-scale road network updating and intelligent driving applications.
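
For readers unfamiliar with the pixel-level metrics named in the abstract, the following is a minimal sketch of how IoU and F1-score are commonly computed for a binary road mask. It is not taken from the paper: the function names `confusion_counts`, `iou`, and `f1_score` are hypothetical, the masks are toy data, and returning 1.0 for two empty masks is an assumed convention. The paper's actual evaluation protocol (for example, any tolerance buffer around centerlines) is not stated in the abstract, and the TOPO metric is not covered here.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives
    over two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou(pred, truth):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return tp / denom if denom else 1.0


def f1_score(pred, truth):
    """F1-score: 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Same assumed convention for empty masks as in iou().
    return 2 * tp / denom if denom else 1.0


# Toy 3x3 example: a vertical road in the ground truth,
# a prediction that misses one road pixel and adds one false pixel.
truth_mask = [[0, 1, 0],
              [0, 1, 0],
              [0, 1, 0]]
pred_mask = [[0, 1, 0],
             [0, 1, 1],
             [0, 0, 0]]

print(iou(pred_mask, truth_mask))       # TP=2, FP=1, FN=1
print(f1_score(pred_mask, truth_mask))
```

In this toy case the missed pixel in the bottom row is exactly the kind of road discontinuity the abstract describes, and it lowers both scores.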

Key words: road network extraction, heterogeneous network collaboration, diversified attention, hierarchical features, semantic segmentation

CLC Number: