Table of Contents

    22 July 2024, Volume 53 Issue 6
    Smart Surveying and Mapping
    Hybrid computational paradigm and methods for intelligentized surveying and mapping
    Jun CHEN, Tinghua AI, Li YAN, Wanzeng LIU, Zhilin LI, Qiang ZHU, Jingxiang GAO, Hong XIE, Hao WU, Jun ZHANG
    2024, 53(6):  985-998.  doi:10.11947/j.AGCS.2024.20240131

    The products of traditional digital surveying and mapping play an increasingly important role as the spatiotemporal foundation and a key production element in the digital economy, digital governance, and digital life. However, their level of detail, update cycle, and service mode struggle to meet the high-quality development demands of the new digital intelligence era. Thus, there is an urgent need to transition from digital to intelligentized surveying and mapping, constructing new spatiotemporal infrastructures to comprehensively enhance the supply of high-quality spatiotemporal information resources, high-level spatiotemporal analysis capabilities, and high-standard spatiotemporal knowledge services. Starting from an analysis of why natural intelligence and artificial intelligence need to be combined in surveying and mapping, this paper first discusses the basic overview and connotation of intelligent computing for surveying and mapping. We then propose a new paradigm of KDAS hybrid intelligent computation for intelligent mapping and outline the basic tasks of its construction. Finally, the paper systematically elaborates hybrid computation methods for the key tasks of intelligentized surveying and mapping, including perception, cognition, expression, and service, and points out basic research directions for constructing the intelligent knowledge system and for the industrial upgrading empowered by hybrid computing.

    Intelligent perception measurement technology of autonomous UAV for unknown environment
    Li YAN, Yinghao ZHAO, Jicheng DAI, Bo XU, Hong XIE, Yuquan ZHOU
    2024, 53(6):  999-1012.  doi:10.11947/j.AGCS.2024.20230389

    The development of intelligent surveying and mapping places higher requirements on efficient, complete, and intelligent data collection, especially in GNSS-denied environments such as under-canopy areas, where traditional methods frequently struggle to achieve efficient and high-coverage measurements. To address the need for intelligent perception measurement of unknown environments, this paper introduces a novel intelligent perception measurement technology and framework that uses an unmanned aerial vehicle (UAV) as the mobile platform and integrates visual online autonomous localization with global exploration path planning. First, the framework incorporates a novel visual-inertial odometry (VIO) online localization algorithm based on point and line features, which solves the initial pose estimation through extraction and matching of point and line features and then generates high-precision UAV pose information in real time using factor graph optimization. Second, to ensure efficient and high-coverage autonomous UAV measurements in unknown environments, the paper employs a globally optimal exploration path planning method that considers multi-level information to determine local exploration targets and then generates high-quality exploration trajectories in real time through a trajectory search and optimization algorithm. The framework was validated on a customized UAV platform. Step-by-step comparisons and an overall real-world test demonstrate that the localization and space exploration methods designed and adopted by the framework have significant advantages over current representative methods. In addition, the framework achieves efficient, high-coverage, fully autonomous measurements in GNSS-denied local under-canopy environments, which establishes a solid theoretical and framework foundation for the further development of online intelligent perception in unknown scenarios.

    Scale evaluation method for fragmented terrain vector data
    Wanzeng LIU, Xinpeng WANG, Tingting ZHAO, Xi ZHAI, Ran LI, Xiuli ZHU, Zhihao JIANG, Yunlu PENG, Ye ZHANG
    2024, 53(6):  1013-1024.  doi:10.11947/j.AGCS.2024.20240024

    To address the low efficiency and low reliability of manually determining the scale of terrain vector data, this paper proposes a scale evaluation method based on natural laws that automates scale determination for terrain vector data. Because the node density of terrain vector data changes with scale, the method uses the principles of natural law and statistical knowledge to determine theoretical node density intervals for different scales and different elements. The node density of the terrain vector data to be identified is then compared with the node density intervals of vector data at known scales; if it falls within the interval of a given scale, the data are inferred to be at that scale, which achieves automatic scale determination. Finally, the feasibility of the evaluation method was verified using two elements: contour lines and water systems. The experimental results show that the accuracy of automatic scale determination for 1∶1 000 000, 1∶250 000 and 1∶50 000 terrain vector data reaches 93.97%, 94.04% and 92.47%, respectively, and the overall accuracy of scale determination reaches 93.21%, greatly improving the accuracy and efficiency of scale judgment and providing technical support for geographic information security assurance.
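
    The interval test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes node density means vertices per kilometre of line length, and the interval bounds in `DENSITY_INTERVALS` are hypothetical placeholders rather than values from the paper.

```python
from math import hypot

# Hypothetical node-density intervals (vertices per km) for one element type.
# Placeholder numbers for illustration only; they are not taken from the paper.
DENSITY_INTERVALS = {
    "1:50 000": (8.0, 40.0),
    "1:250 000": (2.0, 8.0),
    "1:1 000 000": (0.3, 2.0),
}


def node_density(polylines):
    """Vertices per kilometre over a list of polylines with coordinates in metres."""
    n_nodes = sum(len(line) for line in polylines)
    length_m = sum(
        hypot(x2 - x1, y2 - y1)
        for line in polylines
        for (x1, y1), (x2, y2) in zip(line, line[1:])
    )
    if length_m == 0:
        raise ValueError("polylines have zero total length")
    return n_nodes / (length_m / 1000.0)


def infer_scale(polylines, intervals=DENSITY_INTERVALS):
    """Return the scale whose interval contains the node density, else None."""
    density = node_density(polylines)
    for scale, (low, high) in intervals.items():
        if low <= density < high:
            return scale
    return None


# Six vertices along 0.5 km of contour give 12 vertices per km.
dense_contour = [[(0, 0), (100, 0), (200, 0), (300, 0), (400, 0), (500, 0)]]
print(infer_scale(dense_contour))
```

    Returning `None` when no interval matches keeps undecidable data separate from data assigned to a scale.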

    The technology and intelligent development of 3D line cloud reconstruction from multiple images
    Dong WEI, Xinyi LIU, Yongjun ZHANG
    2024, 53(6):  1025-1036.  doi:10.11947/j.AGCS.2024.20230447

    As a collection of line segments, 3D line clouds have distinct geometric structures and semantic information in each individual feature. They can serve as efficient guiding, controlling, and abstract representation elements in structured 3D reconstruction, compensating for the deficiencies in edge description and the lack of initial structure in point clouds. These line clouds represent important structured features that can change the traditional “one-layer skin” 3D model (where different objects are mutually adherent, making spatial analysis and decision-making difficult). However, how to reconstruct useful line clouds from multi-view images and make effective use of them has always been a challenging problem in this field. This article reviews the development of 3D line clouds, introduces related reconstruction methods, and analyzes existing difficulties and shortcomings. Against the background of the transformation from digital to intelligent surveying and mapping technology, it discusses three questions: what to build, how to build, and how to use line clouds in real-world 3D scenarios. The article also introduces the intelligent development of line cloud reconstruction and application and discusses its prospects, hoping to provide a reference for researchers working on real 3D reconstruction and line clouds.

    A review of intelligent InSAR data processing: recent advancements, challenges and prospects
    Liming JIANG, Yi SHAO, Zhiwei ZHOU, Peifeng MA, Teng WANG
    2024, 53(6):  1037-1056.  doi:10.11947/j.AGCS.2024.20230440

    With the continuous accumulation of massive SAR data and the rapid development of deep learning technologies, the era of intelligent InSAR is approaching, mainly characterized by big data analysis and artificial intelligence. This paper provides an overview of recent progress and development trends in InSAR data processing technologies based on deep learning. First, the mainstream InSAR data processing methods are briefly described, and their limitations in complex application scenarios are analyzed in terms of monitoring accuracy, processing efficiency, and automation level. Then, after introducing the main deep learning networks used in InSAR data processing, including the convolutional neural network (CNN), recurrent neural network (RNN), and generative adversarial network (GAN), we systematically review recent advances in intelligent InSAR data processing, e.g. phase filtering, phase unwrapping, PS/DS target selection, atmospheric delay correction, deformation estimation, and deformation prediction. Finally, we discuss the challenges faced by intelligent InSAR data processing based on deep learning and provide an outlook on future development trends.

    Research progress in the application of image semantic information in visual SLAM
    Chi GUO, Yang LIU, Yarong LUO, Jingnan LIU, Quan ZHANG
    2024, 53(6):  1057-1076.  doi:10.11947/j.AGCS.2024.20230259

    Visual simultaneous localization and mapping (VSLAM) technology uses cameras as the primary sensor to capture image data and obtain the position and orientation of the carrier based on algorithms such as multi-view geometry and state estimation, while simultaneously constructing a map for navigation and localization. VSLAM is a key technology in autonomous driving, AR, VR, MR, intelligent robotics, and drone flight control. In recent years, with the increasing demand for intelligent navigation and localization in various industries, VSLAM, which was originally focused on geometric measurements, has gradually integrated a semantic understanding of the environment. Semantic information refers to concepts that can be directly perceived and understood by humans, and semantic information in images refers to information such as object contours, categories, and saliency. Compared to geometric structures and features, image semantic information is more temporally and spatially consistent and provides results that are closer to human perception. Introducing image semantic information into visual SLAM can not only promote the performance of each module of the system, but also enhance the intelligent perception ability of VSLAM, forming a semantic VSLAM that integrates multiple functions such as geometric measurement, localization, and environment understanding. In this article, based on the application of image semantic information, we summarize the classic solutions and the latest research progress in semantic VSLAM. Based on this, we summarize the existing problems and challenges in visual semantic SLAM and propose future research directions in this field to further promote its development towards intelligent navigation and localization.

    Prediction and interpolation of GNSS vertical time series based on the AdaBoost method considering geophysical effects
    Tieding LU, Zhen LI
    2024, 53(6):  1077-1085.  doi:10.11947/j.AGCS.2024.20230434

    Traditional GNSS vertical time series prediction and interpolation methods consider only time variables and have obvious limitations. This study takes into account the impact of geophysical effects, constructs a regression problem using temperature, atmospheric pressure, polar motion, and GNSS vertical time series data, and uses the adaptive boosting (AdaBoost) algorithm for modeling. To verify the prediction and interpolation performance of the model, the vertical time series from four GNSS stations were selected for analysis. The modeling experiment shows that, compared to the Prophet model, the fitting accuracy of the AdaBoost model is improved by 35%. The prediction results indicate that within a 12-month prediction period, the MAE values of the AdaBoost model at the four GNSS stations are approximately 4.0~4.5 mm, and the RMSE values are approximately 5.0~6.0 mm. The interpolation experiment shows that, compared to the cubic spline interpolation method, the accuracy of the AdaBoost interpolation model is improved by about 15%~28%. Our experiments show that the AdaBoost model considering geophysical effects can be applied to the prediction and interpolation of GNSS vertical time series.

    Knowledge-guided dynamic generation of escape route networks for forest fires
    Jun ZHU, Peijing CHEN, Chao ZENG, Quanhong ZHENG, Yakun XIE, Jigang YOU, Huijie LIAN
    2024, 53(6):  1086-1097.  doi:10.11947/j.AGCS.2024.20230234

    A reasonably planned forest fire escape road network plays an important role in emergency escape decision-making, but existing methods have weak dynamic adaptability and do not consider key spatial information that affects escape safety, such as ravine areas and narrow ridges, resulting in poor accuracy of escape road network planning. Therefore, this paper introduces intelligent mapping methods and proposes a knowledge-guided dynamic generation method for forest fire escape road networks, built on key technologies such as knowledge map construction for forest fire escape road network planning and key spatial region extraction. It then establishes a forest fire escape road network access raster network model, realizes dynamic optimized generation of the escape road network with an improved A* algorithm, develops a prototype system, and carries out experimental analysis. The results show that the method can dynamically generate an escape road network while a forest fire is spreading, providing effective escape decision-making information for firefighters. Compared with existing static forest fire escape road network planning methods, the proposed method improves the overlap rate with the high-safety zone by 3.06% and reduces the overlap rate of the escape network with the hazardous zone by 27.39%.
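
    As a rough illustration of dynamic route generation on a raster, the sketch below runs the standard A* algorithm on a 4-connected grid and replans after cells start burning. It is not the improved A* of the paper: the grid encoding (0 passable, 1 impassable), the unit step cost, and the `replan` helper are assumptions made for this example.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a raster; cells equal to 1 are impassable."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # no escape route exists


def replan(grid, burning_cells, start, goal):
    """Mark newly burning cells as impassable on a copy of the grid, then plan again."""
    updated = [row[:] for row in grid]
    for r, c in burning_cells:
        updated[r][c] = 1
    return astar(updated, start, goal)


grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(astar(grid, (0, 0), (0, 2)))
print(replan(grid, [(0, 1)], (0, 0), (0, 2)))
```

    Replanning from scratch is the simplest way to react to fire spread; an incremental planner would avoid repeated work on large rasters.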

    Practical framework and methodology for high-performance intelligent invariant detection in remote sensing imagery
    Xiaogang NING, Hanchao ZHANG, Ruiqian ZHANG
    2024, 53(6):  1098-1112.  doi:10.11947/j.AGCS.2024.20230405

    In addressing the challenges posed by sample category imbalance, limited algorithm applicability, and inadequate knowledge application inherent in traditional change detection techniques, we propose a novel framework for high-reliability, intelligent invariant detection of land classes in remote sensing imagery. This framework employs advanced algorithms to precisely extract stable invariant areas that are typically irrelevant to various tasks, thereby reducing the operational footprint and boosting productivity in practical settings. Commencing with data preprocessing, a sample library tailored to the specifics of invariant detection is developed. Additionally, we introduce a method for invariant detection that utilizes prior information to guide the discrimination between global and local pseudo-changes. This approach leads to the creation of a gridded invariant mask and the introduction of two object-level metrics—compression accuracy and compression range—to assess the framework's performance in terms of accuracy and efficiency. Empirical validation across multiple national regions confirms that this framework not only minimizes the workload associated with manual visual interpretation but also significantly improves the efficiency of data extraction, thus offering a groundbreaking solution for extracting change information from remote sensing data in real-world scenarios.

    A general progressive decomposition long-term prediction network model for high-speed railway bridge pier settlement
    Xunqiang GONG, Hongyu WANG, Tieding LU, Wei YOU
    2024, 53(6):  1113-1127.  doi:10.11947/j.AGCS.2024.20230387

    Uneven settlement of high-speed railway bridge piers is one of the potential causes of track irregularities. Accurately predicting bridge pier settlement is of significant importance for ensuring the reliability and safety of railway construction and operation. Most conventional time series prediction models are tested only on well-preprocessed datasets without missing values. However, in real-world scenarios of high-speed railway bridge pier settlement, the data are characterized by infrequent and irregular observation intervals and complex, variable settlement patterns, posing challenges for long-term prediction. To address this, we introduce the general progressive decomposition long-term prediction network (GPDLPnet), which abandons traditional preprocessing concepts and embeds the preprocessing phase within the network structure, achieving progressive preprocessing during training. In each iteration, GPDLPnet uses an improved diagonally-masked self-attention (IDMSA) module to analyze missing patterns in the settlement data, then decomposes and reconstructs the data into high-frequency, low-frequency, and trend sub-components through an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) module. These sub-components serve as feature inputs for the BiLSTM-RSA-Resnet prediction module, which outputs recursive predictions, thus enabling long-term prediction of high-speed railway bridge pier settlement. Experiments on real-world engineering data are conducted under two typical observation modes, high-frequency and low-frequency. GPDLPnet demonstrates excellent predictive performance over a 3-4 month horizon, surpassing seven other models in accuracy indexes.
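
    The recursive prediction step mentioned in this abstract can be illustrated generically: a one-step model is applied repeatedly, and each prediction is fed back as input for the next step. The sketch below assumes a sliding input window and uses a hypothetical `linear_extrapolation` stand-in for the one-step model; it does not reproduce GPDLPnet or any of its modules.

```python
def recursive_forecast(history, one_step_model, window, steps):
    """Forecast `steps` values by feeding each prediction back into the input window."""
    if len(history) < window:
        raise ValueError("history is shorter than the input window")
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(buffer)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        buffer = buffer[1:] + [next_value]
    return forecasts


def linear_extrapolation(window_values):
    """Hypothetical one-step model: continue the last observed increment."""
    return window_values[-1] + (window_values[-1] - window_values[-2])


print(recursive_forecast([1.0, 2.0, 3.0], linear_extrapolation, window=2, steps=3))
```

    Because every step consumes earlier predictions, errors can accumulate over long horizons, which is one reason long-term prediction is harder than one-step prediction.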

    Step-like displacement prediction of landslides guided by deformation mechanism
    Yanan JIANG, Linfeng ZHENG, Qiang XU, Minggao TANG, Xing ZHU
    2024, 53(6):  1128-1139.  doi:10.11947/j.AGCS.2024.20230463

    Rainfall- and reservoir-induced landslides in the Three Gorges Reservoir (TGR), China, exhibit distinctive step-like deformation characteristics, involving mutation and creep states. These features pose a challenge for accurate early warning and prediction. Previous landslide displacement forecasting models have shown limited prediction accuracy, particularly for mutational displacements. The prediction model proposed in this study, based on Informer, uses a multi-head attention mechanism to capture temporal dependencies and incorporates pooling layers to emphasize crucial features, enabling adaptive learning of feature weights and more effective extraction of periodic information from time series data. The Baishuihe landslide was used as a case study, with monitoring data collected from July 2013 to December 2018, including monthly displacements, daily rainfall, and reservoir water level. First, cumulative displacement was decomposed into trend displacement and periodic displacement by variational mode decomposition (VMD). After selection and decomposition of the triggering factors, the double exponential smoothing (DES) method and the Informer model were used to predict the trend and periodic displacement components, respectively. Finally, the predicted trend and periodic components were combined to generate the cumulative displacement prediction. The proposed model achieves a root mean square error of 12.21 mm, a mean absolute error of 10.05 mm, and a coefficient of determination of 0.99 for the cumulative displacement prediction over the next 27 months. Compared with four other mainstream models, this approach exhibits higher prediction accuracy, particularly in predicting the rapid deformation phase of step-like bank landslides. Consequently, it has practical value for early warning research on rainfall- and reservoir-induced landslides.
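
    The trend branch and the final recombination described above can be sketched in a few lines. The abstract does not say which DES variant is used, so this example assumes Holt's linear (two-parameter) form; the smoothing constants and the sample numbers are hypothetical, and the periodic forecast is simply passed in as a list standing in for the Informer output.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing; returns `horizon` forecasts."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Update the smoothed level, then the smoothed slope.
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + (k + 1) * trend for k in range(horizon)]


def combine(trend_forecast, periodic_forecast):
    """Cumulative displacement = trend component + periodic component."""
    return [t + p for t, p in zip(trend_forecast, periodic_forecast)]


trend_part = holt_forecast([0.0, 2.0, 4.0, 6.0, 8.0], alpha=0.5, beta=0.3, horizon=3)
periodic_part = [1.5, -0.5, 0.0]  # placeholder for a separately predicted periodic term
print(combine(trend_part, periodic_part))
```

    On an exactly linear series the smoothed level and slope track the data, so the forecasts continue the line; real trend displacement series would need tuned `alpha` and `beta`.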

    Intelligent site selection method for UAV-dropped GNSS landslide monitoring equipment
    Hao XU, Qin ZHANG, Li WANG, Bao SHU, Yuan DU, Guanwen HUANG
    2024, 53(6):  1140-1153.  doi:10.11947/j.AGCS.2024.20230358

    The GNSS technology is widely used in landslide deformation monitoring. However, in complex and hazardous landslide areas, it is difficult for personnel to access, and the installation of GNSS monitoring devices faces challenges. The UAV-dropped deployment technology offers a potential solution to this problem, but it requires appropriate target delivery locations for the UAVs. Traditional site selection methods mainly rely on expert field surveys, which fail to meet the requirements of such scenarios. To address this, this study first utilizes drone aerial photography and InSAR-Stacking technology to obtain digital surface models (DSM), digital orthophoto maps (DOM), and surface deformation rate maps of the target site. Then, based on deep learning, terrain analysis, and other methods, the key site selection factors such as historical deformation, crack distribution, slope, surface roughness, vegetation index, and slope direction are extracted. Finally, an analytic hierarchy process is employed to intelligently evaluate the suitability of different locations within the landslide area for UAV-dropped deployment of GNSS monitoring devices and recommend the coordinates of the target delivery locations. The site selection experiments were conducted in the Heifangtai landslide area in Gansu province, China. The suitability of the selected locations within this area was assessed, and four airdrop positions for GNSS monitoring devices were recommended. The effectiveness of the proposed method was validated through on-site observations and historical station deformation sequences. This method comprehensively considers the demands of deformation monitoring, deployment difficulty, observation conditions, and continuous operation, enabling efficient evaluation of the suitability of equipment deployment in the site selection area. It holds significant reference value for the unmanned and intelligent deployment of GNSS monitoring devices.
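
    The analytic hierarchy process step can be illustrated with a small weighting example. This sketch assumes the row geometric-mean approximation of the priority vector (rather than the principal eigenvector) and uses only three of the site selection factors; the pairwise judgments and the normalized factor scores are hypothetical, not values from the study.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def suitability(factor_scores, weights):
    """Weighted sum of normalized factor scores (each assumed to lie in 0..1)."""
    return sum(score * weight for score, weight in zip(factor_scores, weights))


# Hypothetical judgments: deformation vs. slope vs. vegetation index.
# Entry [i][j] states how much more important factor i is than factor j.
pairwise = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
print(suitability([0.9, 0.4, 0.7], weights))
```

    For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights; with real expert judgments a consistency check would normally accompany the weighting.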

    Identification of loess landform types jointly affected by contour morphological knowledge and the graph neural network
    Bo KONG, Tinghua AI, Min YANG, Hao WU, Huafei YU, Tianyuan XIAO
    2024, 53(6):  1154-1164.  doi:10.11947/j.AGCS.2024.20230445

    Landform type identification is a complex decision-making problem jointly affected by multiple factors. Because of the extent and diversity of regional landform environments and the complexity of the roles of geological elements, satisfactory results cannot be obtained by simply introducing artificial intelligence (AI) methods and performing supervised learning on typical samples. This study therefore integrates knowledge of contour morphology, as natural intelligence in surveying and mapping, into AI technology and investigates loess landform type identification through hybrid intelligence that combines landform sample training with landform morphological representation rules. The paper presents a landform type recognition method that integrates contour morphological knowledge with the graph neural network (GNN). In this method, the contours of a landform unit are modeled as a graph structure composed of nodes and connecting edges, and the extracted contour vertex morphology knowledge is embedded in the graph nodes. A GNN model with pooling operations is used to mine high-level features and context information in the graph structure to identify unit types. The experimental results demonstrate the effectiveness of the proposed approach in identifying loess landform types, achieving an F1 score of 86.1% on the test dataset, which represents a 3.0%~8.2% improvement over the two comparative methods.

    Dynamic construction of high-resolution remote sensing image sample datasets and intelligent interpretation applications
    Haiyan GU, Yi YANG, Haitao LI, Lijian SUN, Shaopeng DING, Shiqi LIU
    2024, 53(6):  1165-1179.  doi:10.11947/j.AGCS.2024.20230469

    In the era of artificial intelligence, the interpretation of remote sensing images is moving towards automation and intelligence. High-quality sample datasets are crucial for this development. A massive amount of high-quality spatio-temporal geospatial information and derived products has been accumulated in China, which serves as an important source for deep learning-driven intelligent interpretation of remote sensing images. Leveraging existing data resources can promote the depth and breadth of artificial intelligence and remote sensing interpretation applications. To address the limitations of existing sample datasets, such as regional restrictions, limited timeliness, and limited sample types, this paper proposes a dynamic construction technique for high-resolution remote sensing image intelligent interpretation sample datasets based on existing data resources. First, a business-driven framework for sample dataset demand generation, dynamic construction, and intelligent application is proposed, based on an analysis of the characteristics of publicly available sample datasets for feature extraction, land cover classification, and change detection. Second, the research investigates methods and implementation processes for sample generation based on historical interpretation results, as well as sample refinement through cleaning guided by prompt learning with the SAM (segment anything model) large model. Furthermore, a sample dataset with regional, temporal, scale, multi-sensor, and multi-type features is designed, along with spatial-temporal-land cover relationships. The study explores the dynamic reconstruction process of sample dataset quantification-retrieval-combination, achieving dynamic management and multidimensional retrieval of spatiotemporal samples. Finally, intelligent interpretation applications such as land cover classification, feature extraction, and change detection are conducted to validate the feasibility of the proposed methods. The aim is to promote the effective utilization of sample datasets based on existing foundational data, as well as the interconnection of sample construction-management-application and data-model-business, providing reference ideas for the construction and application of high-resolution remote sensing image intelligent interpretation sample datasets.

    Local clarity estimation and adaptive segmentation of line features based on positive and negative kernel density curves
    Xiaoqiang CHENG, Na LIU
    2024, 53(6):  1180-1194.  doi:10.11947/j.AGCS.2024.20230287

    In line simplification, segmenting map line features according to differences in their morphological characteristics is the key to applying simplification methods rationally. Existing segmentation methods mainly analyze the morphological heterogeneity of line features on the basis of vertices and ignore the morphological changes of line features expressed at different scales. The fuzzy parts of heterogeneous line features that cannot be clearly identified change with scale. On this basis, this paper proposes a method for segmenting line features according to changes in clarity. First, the raster pattern of a line feature is generated at a specific scale, and the raster line pixels are classified into three types: single-boundary pixels, double-boundary pixels, and internal pixels, with single-boundary pixels and internal pixels affecting visual discrimination. Second, the mapping relationship between the three types of pixels and the original vector line is established, yielding two groups of data points: adhesive vertices, which correspond to the blurred parts of the line feature, and normal vertices, which correspond to its clear parts. Third, aggregation analysis of the two groups of data points based on kernel density generates a positive kernel density curve, which indicates the change in line feature clarity at this scale, and a negative kernel density curve, which indicates the change in the degree of blurring. Finally, the intersections of the two kernel density curves are analyzed to obtain the segmentation points that separate the clear and blurred parts of the line feature, completing the segmentation. Comparison with manually segmented results shows that the segmentation results of this paper are generally consistent with the human eye's identification of the clear and blurred parts of line features.
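
    The idea of locating segmentation points where two kernel density curves intersect can be sketched as follows. This is an illustration under several assumptions that the abstract does not confirm: vertices are represented by their distance along the line, a Gaussian kernel with a fixed bandwidth is used, the positive curve is built from normal vertices and the negative curve from adhesive vertices, and crossings are located by linear interpolation between grid points.

```python
from math import exp, pi, sqrt


def kde_curve(positions, grid, bandwidth):
    """Sum of Gaussian kernels centred on 1D positions, evaluated on a grid."""
    norm = 1.0 / (bandwidth * sqrt(2.0 * pi))
    return [
        sum(norm * exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in positions)
        for g in grid
    ]


def crossing_points(grid, positive, negative):
    """Positions where the two curves swap order between neighbouring grid points."""
    difference = [p - n for p, n in zip(positive, negative)]
    points = []
    for i in range(len(grid) - 1):
        d0, d1 = difference[i], difference[i + 1]
        if d0 * d1 < 0:  # strict sign change
            t = d0 / (d0 - d1)
            points.append(grid[i] + t * (grid[i + 1] - grid[i]))
    return points


normal_vertices = [0.0, 1.0, 2.0, 3.0, 4.0]     # clear part of the line
adhesive_vertices = [6.0, 7.0, 8.0, 9.0, 10.0]  # blurred part of the line
grid = [i * 0.4 for i in range(26)]
positive = kde_curve(normal_vertices, grid, bandwidth=1.0)
negative = kde_curve(adhesive_vertices, grid, bandwidth=1.0)
print(crossing_points(grid, positive, negative))
```

    The bandwidth controls how many crossings appear: a small value follows individual vertices, while a large value merges short clear and blurred stretches into longer segments.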

    High-resolution optical images change detection based on global information enhancement by pyramid semantic token
    Daifeng PENG, Chenchen ZHAI, Dingwei ZHOU, Yongjun ZHANG, Haiyan GUAN, Yufu ZANG
    2024, 53(6):  1195-1211.  doi:10.11947/j.AGCS.2024.20230415

    Owing to the influence of complex backgrounds and spectral changes, missed detection of small objects and incomplete detection of geometric structures and details easily arise in remote sensing change detection (CD). To address these issues, this paper proposes a pyramid semantic token guided global information enhancement change detection network (PST-GIENet) that combines the advantages of the convolutional neural network (CNN) and the Transformer network. First, a ResNet18 network without the max-pooling layer is adopted to generate bi-temporal deep features, which are fused and refined by a joint attention mechanism and a deep supervision strategy. Second, image features are represented as multi-scale semantic tokens through spatial pyramid pooling, and a Transformer encoder-decoder is then employed to model the global context of the fused features. Finally, the change map is produced through a layer-wise up-sampling decoder. To verify the effectiveness of the proposed method, extensive experiments and analysis were conducted on three publicly available CD datasets: LEVIR-CD, CDD, and WHU-CD. The quantitative results show that PST-GIENet achieves the highest metric scores on all three datasets, with F1 scores of 91.71%, 96.16%, and 94.08%, respectively. In addition, visual results indicate that PST-GIENet can effectively suppress interference from complex backgrounds and spectral distortions, which significantly enhances the network's ability to capture edge structures and multi-scale changes of ground objects, achieving the best visual performance.
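
    To make the tokenization step concrete, the sketch below turns a small feature map into multi-scale tokens with spatial pyramid pooling. It is a simplified illustration, not PST-GIENet: it assumes average pooling, pyramid levels of 1×1 and 2×2 bins, and a plain nested-list feature map indexed as `feature_map[channel][row][column]`.

```python
from math import ceil, floor


def pyramid_tokens(feature_map, levels=(1, 2)):
    """Average-pool a C x H x W feature map into one C-dimensional token per pyramid bin."""
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    tokens = []
    for bins in levels:
        for by in range(bins):
            y0, y1 = floor(by * height / bins), ceil((by + 1) * height / bins)
            for bx in range(bins):
                x0, x1 = floor(bx * width / bins), ceil((bx + 1) * width / bins)
                area = (y1 - y0) * (x1 - x0)
                tokens.append([
                    sum(
                        feature_map[c][y][x]
                        for y in range(y0, y1)
                        for x in range(x0, x1)
                    ) / area
                    for c in range(channels)
                ])
    return tokens


# One channel, 2 x 2 map: one global token followed by four local tokens.
print(pyramid_tokens([[[1.0, 2.0], [3.0, 4.0]]]))
```

    The number of tokens depends only on the pyramid levels (here 1 + 4 = 5), not on the image size, which keeps the sequence passed to a Transformer short.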

    Multi-level contrastive learning for weakly supervised extraction of urban solid wastes dump from high-resolution remote sensing images
    Jicheng WANG, Anmei GUO, Li SHEN, Tian LAN, Zhu XU, Zhilin LI
    2024, 53(6):  1212-1223.  doi:10.11947/j.AGCS.2024.20230543

    Urban solid waste is a major pollutant of the urbanization process that endangers urban ecology and public health. Intelligent interpretation of high-resolution satellite imagery for solid waste dumps (SWD) is crucial for automated monitoring. However, deep learning-based automatic extraction methods for SWD rely heavily on high-quality pixel-level annotations, which are costly and labor-intensive to produce. This paper presents a weakly supervised method that uses only image-level annotations to perform pixel-level SWD extraction. The method leverages the image characteristics of SWD and applies contrastive learning at both the pixel and image levels, under scale-contrast constraints, to improve the class activation maps (CAMs) of SWD. From the CAMs, the method generates high-quality pixel-level pseudo-labels that are used to train the SWD extraction model. Experiments on a self-created SWD dataset demonstrate that the proposed method achieves an F1 score of 71.58% and an IoU score of 55.74%, which are significantly higher than those of the baseline methods. This shows that the multi-level contrastive learning-based weakly supervised method produces more complete and accurate CAMs of SWD, leading to better extraction performance.
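
    The two reported scores can be cross-checked against each other. For a single foreground class, F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN), so IoU = F1 / (2 - F1). The sketch below assumes both scores come from the same pixel-level confusion counts for the SWD class, which the abstract does not state; the function names are hypothetical.

```python
def iou_from_f1(f1: float) -> float:
    """Convert a single-class F1 score (in [0, 1]) to the matching IoU."""
    return f1 / (2.0 - f1)


def f1_from_iou(iou: float) -> float:
    """Inverse conversion: single-class IoU (in [0, 1]) to F1."""
    return 2.0 * iou / (1.0 + iou)


# Reported F1 of 71.58% mapped to IoU under the assumption above.
implied_iou = iou_from_f1(0.7158)
print(f"implied IoU: {implied_iou:.4f}")
```

    Under that assumption, the reported F1 of 71.58% implies an IoU of about 55.74%, which agrees with the reported IoU.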

    Building change detection method combining object feature guidance and multiple attention mechanism
    Shaopeng DING, Xiushan LU, Rufei LIU, Yi YANG, Haiyan GU, Haitao LI
    2024, 53(6):  1224-1235.  doi:10.11947/j.AGCS.2024.20230436

    High-resolution remote sensing images contain rich detail, and building changes vary in type and differ greatly in scale. To address the voids and omissions that commonly appear in detected building changes in complex environments, this paper proposes a building change detection method that combines object feature guidance with multiple attention mechanisms. The method extracts fine change information from high-resolution images by enhancing category information through building target-level guidance. It consists of a building significant enhancement module and a target-guided multi-attention module. The former extracts the key areas of buildings through global deep feature perception and fusion. The latter combines target-level feature guidance with multiple self-attention to strengthen feature expression and enhance contextual feature correlation, which effectively reduces the loss of detailed features and mitigates the loss of detail caused by target voids and unclear edges. Two sets of experiments show that the method improves accuracy, effectively reduces missed changes in scenes with many kinds of change, and improves the stability of the algorithm.

    Location and rapid detection method of water leakage in subway tunnels based on mobile laser scanning
    Changqi JI, Zhaojie GUO, Haili SUN, Ruofei ZHONG
    2024, 53(6):  1236-1250.  doi:10.11947/j.AGCS.2024.20230438

    Water leakage is one of the most important defects in subway tunnels and can also cause other structural defects, so research on methods for detecting it is of great significance. This paper proposes a water leakage location and detection method based on mobile laser scanning point cloud data. First, a precise tunnel positioning method was developed for use with mobile laser scanning detection. Then, the YOLOv7 model was improved: the ConvNeXt network and the CBAM module were introduced so that the model better captures multi-scale features at multiple levels of abstraction and pays more attention to the key features of water seepage. The GIoU loss function was used so that the model better handles incomplete bounding boxes of leakage areas, and the Soft-NMS weighted average method is used at prediction time to retain more bounding boxes, thereby improving detection accuracy. The efficiency and robustness of the method were verified on shield method and mine method tunnel leakage data sets constructed from laser scanning data acquired in the Chongqing metro. The ablation test shows that, compared with the baseline model, the method achieves significant performance improvements on both data sets. On the shield method data set, the precision P increased by 8.1% and the recall R increased by 4%. On the mine method data set, the precision P increased by 8.6% and the recall R increased by 6.8%. Compared with the mainstream object detection algorithms Faster R-CNN (Swin), Faster R-CNN (ConvNeXt), and YOLOv8, the method also shows advantages in both accuracy and speed. Finally, the paper presents location and detection results for water leakage in several tunnels to verify the practicability of the method.
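
    For readers unfamiliar with the GIoU loss mentioned above, a minimal sketch of the standard definition for two axis-aligned boxes follows. It is not taken from the paper's code: the `(x1, y1, x2, y2)` box convention and the function names are assumptions made for illustration.

```python
from typing import Tuple

# Box as (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed convention).
Box = Tuple[float, float, float, float]


def giou(a: Box, b: Box) -> float:
    """Generalized IoU: IoU minus the share of the smallest enclosing box
    that is not covered by the union of the two boxes."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # Intersection rectangle; width or height clamps to zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))

    return inter / union - (enclose - union) / enclose


def giou_loss(pred: Box, target: Box) -> float:
    """Loss in [0, 2]; zero only when the boxes coincide."""
    return 1.0 - giou(pred, target)


# Disjoint boxes: plain IoU is 0 regardless of distance, GIoU is negative.
far_apart = giou((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

    Unlike plain IoU, GIoU still varies when the predicted and target boxes do not overlap, so the loss keeps a useful training signal for poorly localized predictions.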