As the most widely used low-cost GNSS terminals at present, smartphones are limited by the physical characteristics of their built-in linearly polarized antennas. Their GNSS signal reception is vulnerable to occlusion in complex environments such as urban buildings and trees, which leads to prominent multipath effects and poor carrier-phase continuity in the observations and, in turn, to a substantial degradation in positioning accuracy. To address this problem, this study takes the multi-source observation data exposed by the Android system as the core input, including GNSS raw measurements, the attitude yaw angle and velocity derived from inertial sensors, and quality indicators from the positioning solution such as pseudorange residuals and position dilution of precision (PDOP). Dynamic data collection was conducted in typical complex urban scenarios, and a prediction and correction framework for the 3D real-time kinematic (RTK) positioning error of smartphones was constructed. Considering the spatiotemporal correlation and multi-feature coupling of positioning errors, this paper proposes a convolutional neural network and long short-term memory (CNN-LSTM) network integrated with a channel-spatial dual attention mechanism: the convolutional layer extracts the spatial correlation of the multi-source features, the LSTM layer captures the temporal dependence of the error sequences, and the dual attention mechanism increases the weights of key satellite channels and core observation features, respectively, thereby achieving accurate modeling of error patterns in complex environments. 
Tests were conducted with two smartphones of different hardware configurations, the Xiaomi Mi 8 and the Google Pixel 6, in asymmetric transition occlusion environments and severe occlusion environments. The results indicate that the positioning accuracy of the Mi 8 improved by approximately 49.3% and 63.9%, respectively, while that of the Pixel 6 improved by 37.5% and 47.1%, respectively. These results verify the generality and effectiveness of the method across different hardware terminals and complex scenarios, providing lightweight algorithmic support for high-precision smartphone positioning.
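The channel branch of the dual attention mechanism can be illustrated with a squeeze-and-excitation style gate. This is only a minimal sketch of one common form of channel attention, not the network used in the study: the function name, the tiny weight matrices, and the choice of the mean as the pooling step are all assumptions made for illustration, and in a trained model the weights would be learned.

```python
import math


def channel_attention(features, w_reduce, w_expand):
    """Reweight feature channels with a squeeze-and-excitation style gate.

    features : list of channels, each a list of values over a time window
    w_reduce : hidden x channels weights (hypothetical, normally learned)
    w_expand : channels x hidden weights (hypothetical, normally learned)
    """
    # Squeeze: summarize each channel by its mean over the window.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excite: a small bottleneck with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every sample of a channel by that channel's gate.
    reweighted = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return gates, reweighted
```

Each channel could hold one of the inputs named in the abstract (for example a pseudorange residual series or PDOP); a gate close to 1 keeps the channel, while a gate close to 0 suppresses it.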
In GNSS applications, there exist unmodeled errors that cannot be effectively compensated by differencing and linear combination, empirical model correction, or traditional parameterization. These errors limit the precision and credibility of positioning, navigation, and timing (PNT). By leveraging a priori and a posteriori information as well as multiple indicators, and driven by both models and data, we design overall and source-specific detection methods for unmodeled errors and develop a complete significance testing procedure. Considering practical observation scenarios, we propose a resilient compensation method based on functional and stochastic models, encompassing strategies that treat the unmodeled errors as negligible, corrected, fixed, float, or weighted. In combination with data quality control strategies, we further develop stochastic characterization and quality control methods that account for the unmodeled errors. Experimental results demonstrate that the proposed real-time processing of GNSS unmodeled errors can effectively improve GNSS positioning accuracy and credibility. This work provides technical support for credible PNT under complex conditions and enriches the theoretical and methodological system of geodetic data processing.
Multi-sensor fusion is a core terminal technology in the national comprehensive positioning, navigation, and timing (PNT) system. However, existing multi-sensor fusion frameworks often require pre-configuration and task-specific customization, making it difficult to achieve rapid and seamless reconfiguration in multi-scenario and multi-task PNT applications. Users are therefore forced to repeatedly adjust both PNT software and hardware settings. To address this issue, this paper proposes a unified and general multi-sensor fusion framework, termed componentized PNT. The framework theoretically supports the flexible integration of sensors of arbitrary types and quantities in a blind-plug-and-play manner, fundamentally enhancing the adaptability of PNT systems in complex environments. Based on an edge-computing architecture, the framework adopts standardized hardware and software interfaces, deploying the sensor-specific measurement modeling module on the edge sensor side and the sensor-shared fusion module on the central PNT platform. The interaction between the two modules is standardized through a unified mathematical representation, thereby overcoming the inherent limitation of traditional frameworks that require sensor pre-configuration on the PNT platform. Based on a self-developed prototype for the componentized PNT system, real-time experiments were conducted under complex scenarios involving sensor connection, removal, and replacement. The results validate the blind-plug-and-play capability of the proposed system, as well as its autonomous reconfiguration ability during multi-domain transitions and dynamic sensor switching. Experimental results show that as the number and diversity of sensors increase, the componentized PNT system achieves immediate improvements in real-time positioning performance, exhibiting high operational efficiency during transitions between different scenarios.
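The split between sensor-specific modeling and a sensor-shared fusion module can be sketched as follows. The abstract does not specify the unified mathematical representation, so this sketch assumes it is a list of linearized scalar measurements `(h, z, r)` (design row, observed value, variance); the class names, the two-state `[position, velocity]` vector, and the sequential Kalman update are likewise illustrative assumptions.

```python
class PositionSensor:
    """Edge-side module (hypothetical): turns a position fix into (h, z, r)."""

    def __init__(self, measured_position, variance):
        self.z, self.var = measured_position, variance

    def measurements(self):
        return [([1.0, 0.0], self.z, self.var)]


class VelocitySensor:
    """Edge-side module (hypothetical): turns a speed reading into (h, z, r)."""

    def __init__(self, measured_velocity, variance):
        self.z, self.var = measured_velocity, variance

    def measurements(self):
        return [([0.0, 1.0], self.z, self.var)]


def fuse(x, P, sensors):
    """Sensor-agnostic fusion: sequential scalar Kalman updates.

    The fusion core only sees (h, z, r) tuples, so any object exposing
    measurements() can be added or removed without changing this function.
    """
    n = len(x)
    for sensor in sensors:
        for h, z, r in sensor.measurements():
            Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
            s = sum(h[i] * Ph[i] for i in range(n)) + r   # innovation variance
            innov = z - sum(h[i] * x[i] for i in range(n))
            k = [v / s for v in Ph]                       # Kalman gain
            x = [x[i] + k[i] * innov for i in range(n)]
            P = [[P[i][j] - k[i] * Ph[j] for j in range(n)] for i in range(n)]
    return x, P
```

Calling `fuse` first with one sensor and then with two mimics plugging in an additional sensor at run time: the second call tightens the estimate without any reconfiguration of the fusion code.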
Multi-sensor fusion effectively mitigates the limitations of individual navigation techniques, significantly enhancing the availability and reliability of navigation systems in complex environments. Conventional approaches for navigation parameter estimation, such as Kalman filtering and least-squares optimization, are predominantly based on discrete-time state models. These methods exhibit inherent limitations when dealing with asynchronous, multi-rate, and high-frequency sensor data. This paper proposes a continuous-time state estimation framework that employs uniform B-splines to parameterize the motion trajectory of the vehicle, so that the original discrete pose estimation problem is transformed into the estimation of B-spline control points. Because the trajectory has a continuous-time analytical expression, it can be differentiated directly to establish observation models that relate inertial measurement unit (IMU) outputs (i.e., acceleration and angular velocity) to the spline control points, thereby avoiding the precision degradation and noise property alterations associated with traditional inertial navigation system (INS) integration. Similarly, measurements from other sensors, such as position and velocity from GNSS, are also formulated as observation equations with respect to the spline control points, leading to a unified optimization problem. Simulation experiments demonstrate that the proposed method yields pose estimates with reduced high-frequency noise influence and improved smoothness. Specifically, it achieves a 41.4% improvement in positioning accuracy and a 35.0% improvement in attitude determination accuracy. During GNSS signal outages, the new approach shows slower error growth, reducing the maximum horizontal positioning error from 17.4 cm to 7.29 cm. In segments with missing IMU data, the method eliminates the need for data interpolation, improving positional accuracy nearly twofold and attitude accuracy nearly fourfold. 
Experiments using real-world datasets further validate the effectiveness of the proposed method, showing that it maintains high performance even in complex environments, with positioning accuracy improved by 30.0% and attitude accuracy improved by 69.6%.
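The way a spline trajectory yields IMU-type observations by differentiation can be illustrated in one dimension. The sketch below assumes a uniform cubic B-spline (the abstract only says "uniform B-splines") and ignores rotation and gravity, so the second derivative stands in for an accelerometer observation; the function names and the knot spacing `dt` are illustrative.

```python
# Basis matrix of a uniform cubic B-spline segment, rows for 1, u, u^2, u^3.
BASIS = [
    [1 / 6, 4 / 6, 1 / 6, 0.0],
    [-3 / 6, 0.0, 3 / 6, 0.0],
    [3 / 6, -6 / 6, 3 / 6, 0.0],
    [-1 / 6, 3 / 6, -3 / 6, 1 / 6],
]


def _blend(powers, ctrl):
    """Combine four control points with weights powers @ BASIS."""
    weights = [sum(powers[k] * BASIS[k][j] for k in range(4)) for j in range(4)]
    return sum(w * c for w, c in zip(weights, ctrl))


def spline_position(ctrl, u):
    """Position at normalized time u in [0, 1) within one segment."""
    return _blend([1.0, u, u * u, u ** 3], ctrl)


def spline_velocity(ctrl, u, dt):
    """First time derivative; dt is the knot spacing in seconds."""
    return _blend([0.0, 1.0, 2.0 * u, 3.0 * u * u], ctrl) / dt


def spline_acceleration(ctrl, u, dt):
    """Second time derivative, linear in the control points."""
    return _blend([0.0, 0.0, 2.0, 6.0 * u], ctrl) / (dt * dt)
```

Because position, velocity, and acceleration are all linear in the four control points, a GNSS position and an accelerometer sample taken at different times become observation equations on the same unknowns, with no need to resample either sensor.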
High-precision BeiDou/GNSS signals provide users with reliable positioning information. However, achieving high-accuracy, robust autonomous positioning remains a core challenge for mobile robot systems when these signals are obstructed or unavailable. Although significant progress has been made in LiDAR-inertial odometry (LIO) algorithms in recent years, observation degradation and drift frequently arise in scenarios involving sparse point clouds, field-of-view obstructions, and large-angle rotations, severely impacting system stability and positioning accuracy. Ultra-wideband (UWB) ranging, as a low-power, low-cost absolute positioning technology, offers advantages such as low latency and interference resistance, but it is highly susceptible to non-line-of-sight (NLOS) errors. The three sensors are clearly complementary: LiDAR provides high-frequency geometric feature information, the IMU maintains short-term continuity, and UWB offers global positional constraints. Nevertheless, constructing a targeted fusion framework that exploits these strengths to achieve complementary observations and error suppression remains a key challenge in multi-source sensor fusion. To address these issues, this paper designs a tightly coupled LiDAR/UWB/INS model based on the iterated error-state robust Kalman filter (IESRKF), enabling unified modeling of observation residuals and adaptive robust estimation across heterogeneous sensors. By incorporating UWB ranging as an absolute position constraint and employing multi-iteration linear optimization, the stability of state convergence is enhanced, effectively mitigating error accumulation and drift in LIO systems during large-angle rotation scenarios. Experimental results demonstrate that the proposed ULIO-IESRKF algorithm outperforms the traditional LIO and ULIO-LC algorithms on paths with large rotations and severe obstructions. 
Furthermore, the multi-iteration linear optimization mechanism effectively mitigates modeling errors introduced by first-order linearization. Compared to the LIO algorithm, the ULIO-IESRKF algorithm achieves improvements of 29.15%, 42.42%, and 30.37% in the E, N, and U directions, respectively. It enhances planar positioning accuracy by 38.00% and improves accuracy in Pitch, Roll, and Yaw by 10.15%, 6.70%, and 34.80%, respectively. Experimental results fully validate that this algorithm achieves high positioning accuracy while demonstrating strong robustness and dynamic adaptability.
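The effect of re-linearizing the measurement model several times can be seen in a small iterated Kalman measurement update for UWB ranges. This is a simplified sketch, not the IESRKF of the paper: it works on a 2-D position only, assumes an isotropic prior covariance, omits the robust reweighting and the LiDAR and INS terms, and uses hypothetical anchor positions and variances.

```python
import math


def iterated_range_update(x0, p0_var, anchors, ranges, r_var, iterations=5):
    """Iterated update of a 2-D position from range measurements.

    x0      : prior position (x, y)
    p0_var  : prior variance per axis (isotropic prior, an assumption)
    anchors : list of (x, y) anchor positions, none coinciding with the estimate
    ranges  : measured distances to the anchors
    r_var   : range measurement variance
    """
    x = list(x0)
    for _ in range(iterations):
        # Normal equations in information form: prior term plus ranges,
        # with the Jacobian re-evaluated at the current iterate x.
        a11 = a22 = 1.0 / p0_var
        a12 = 0.0
        b1 = (x0[0] - x[0]) / p0_var
        b2 = (x0[1] - x[1]) / p0_var
        for (ax, ay), rho in zip(anchors, ranges):
            dx, dy = x[0] - ax, x[1] - ay
            pred = math.hypot(dx, dy)
            hx, hy = dx / pred, dy / pred      # Jacobian of the range
            res = rho - pred                   # residual at the iterate
            a11 += hx * hx / r_var
            a12 += hx * hy / r_var
            a22 += hy * hy / r_var
            b1 += hx * res / r_var
            b2 += hy * res / r_var
        det = a11 * a22 - a12 * a12
        x[0] += (a22 * b1 - a12 * b2) / det
        x[1] += (a11 * b2 - a12 * b1) / det
    return x
```

With `iterations=1` this reduces to a single extended Kalman update linearized at the prior; additional iterations move the linearization point toward the solution, which is the mechanism that limits first-order linearization error.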
How to effectively resolve the contradiction between the security protection and the application of geographic information is a common challenge faced by countries worldwide. For a long time, because geographic information security technology in China has lacked a solid theoretical foundation, its technical logic has failed to effectively support the management logic, exacerbating the contradiction between security control and application of geographic information. To address this issue, based on the theory of information uncertainty, this paper proposes the technical logic of geographic information security by utilizing the unique attributes of information and combining the basic elements of state secrets. It holds that geographic information is a complex, integrated entity in which certainty and uncertainty are intertwined, both internally and externally. The dynamic fluctuation of its inherent uncertainty changes the importance and reliability of geographic information and triggers the mutual conversion between classified and unclassified geographic information, which constitutes the inherent logic of geographic information security. Furthermore, this paper clarifies the technical logic of the classification, declassification, and assessment of geographic information. Finally, it points out the path by which the technical logic of geographic information security can support the management logic.
Building shape recognition and classification serve as pivotal components in urban spatial analysis, intelligent cartography, and 3D city modeling, playing a critical role in high-precision map construction and smart city governance. However, existing approaches still face significant challenges when dealing with the complex and diverse morphologies of buildings. Methods based on handcrafted geometric features often suffer from limited generalization capability, while rasterization-based approaches tend to introduce geometric distortions and struggle to preserve the topological structure and fine details of vectorized building outlines. To address these issues, this paper proposes an end-to-end automatic feature learning framework tailored for vectorized building contours. The method represents building outlines as corner-point graph structures to achieve a structured representation of building shapes. Furthermore, a dedicated structural feature module is integrated into a graph convolutional network (GCN) to effectively capture morphological characteristics such as concave-convex transitions and branching patterns, thereby enhancing the model's discriminative power for complex geometric structures. Experimental results demonstrate that the proposed method achieves classification accuracies of 99.20% and 99.03% on public and extended datasets, respectively, with Kappa coefficients exceeding 0.989, indicating superior performance. Error analysis reveals that misclassifications primarily occur between geometrically or topologically highly similar categories (e.g., E/U-shaped or X/O-shaped buildings), suggesting room for improvement in fine-grained structural modeling and global semantic integration. This study provides a robust, fully automatic, and generalizable solution for high-precision recognition of vectorized building contours without reliance on manual feature engineering.
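The corner-point graph representation can be sketched as a ring graph whose nodes carry simple per-corner features. The features chosen here (signed turning angle, a convex flag, and outgoing edge length) and the assumption of counterclockwise vertex order are illustrative; the structural feature module and the GCN layers of the paper are not reproduced.

```python
import math


def corner_graph(polygon):
    """Build a corner-point graph from a counterclockwise building outline.

    polygon : list of (x, y) vertices without a repeated closing vertex
    Returns (nodes, edges): one feature dict per corner and ring edges.
    """
    n = len(polygon)
    nodes, edges = [], []
    for i in range(n):
        px, py = polygon[i - 1]
        cx, cy = polygon[i]
        nx, ny = polygon[(i + 1) % n]
        ax, ay = cx - px, cy - py            # incoming edge vector
        bx, by = nx - cx, ny - cy            # outgoing edge vector
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        nodes.append({
            "turn": math.atan2(cross, dot),  # signed turning angle
            "convex": cross > 0,             # left turn on a CCW outline
            "edge_len": math.hypot(bx, by),
        })
        edges.append((i, (i + 1) % n))
    return nodes, edges
```

For an L-shaped footprint the single reflex corner shows up as the only node with `convex` set to `False`, which is the kind of concave-convex transition a graph network can then aggregate over neighboring corners.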
Least squares collocation (LSC) is a key theory for processing geophysical and geodetic data. However, its application to large-scale observations is constrained by the substantial computational and storage costs required to construct and invert a dense covariance matrix whose size scales with the number of observations. To overcome this bottleneck, we propose a greedy-algorithm-based sparse approximation method for LSC. Specifically, LSC is reformulated within a sparse-representation framework as a sparse coefficient recovery problem, which is then solved iteratively using matching pursuit (MP) and orthogonal matching pursuit (OMP), thereby avoiding explicit factorization and inversion of the full dense covariance matrix. A case study on geoid modelling using multi-source gravity data in Colorado, USA, shows that the proposed method achieves an external validation accuracy (2.24 cm) comparable to that of LSC (2.27 cm). Moreover, the resulting sparse model stores only 3.9% of the coefficients, leading to more than a 25-fold improvement in gridded prediction efficiency and enabling model lightweighting. Further semi-simulation statistical experiments indicate that, when covariance parameters estimated from noisy data aggravate model mismatch, the greedy collocation scheme exhibits more robust statistical behavior. Overall, the proposed method enhances scalability while maintaining modelling accuracy, providing a computationally feasible solution for high-precision gravity field modelling under large-scale observations.
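The greedy selection step can be illustrated with plain matching pursuit on a generic dictionary. This sketch assumes unit-norm atoms and makes no attempt to reproduce the collocation setting, where the atoms would presumably be built from covariance functions; the function name and the stopping rule are illustrative.

```python
def matching_pursuit(atoms, signal, n_iter):
    """Greedy sparse approximation of signal by unit-norm atoms.

    atoms  : list of atoms, each a list with the same length as signal
    signal : list of observed values
    n_iter : maximum number of greedy selections
    Returns (coeffs, residual) with coeffs as {atom_index: coefficient}.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iter):
        # Correlate every atom with the current residual.
        scores = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        if scores[best] == 0.0:
            break                       # nothing left to explain
        # Keep the best atom and remove its contribution from the residual.
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual
```

Only the selected coefficients need to be stored and evaluated at prediction time, which is where the storage and gridding savings reported above would come from. Orthogonal matching pursuit differs in that it re-solves a least-squares problem over all selected atoms after each selection.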
Addressing the challenges of high feature mismatch rates and poor robustness in laser point cloud registration for geological hazard monitoring, as well as the difficulty of distinguishing subtle deformations from vegetation interference and environmental noise, this paper proposes a millimeter-level surface deformation detection method driven by coarse-to-fine high-precision laser point cloud registration. First, graph theory is used to eliminate gross errors in feature matching, and an enhanced GNC-Welsch robust estimator is designed to prevent coarse registration from falling into local minima. High-precision registration is then achieved using microstructure-based hybrid feature factors. A “precise extraction-multi-dimensional verification” strategy is proposed: feature analysis filters vegetation interference and extracts candidate deformation regions, while a multi-dimensional verification framework integrating geometric, statistical, and physical features eliminates false deformations. Simulation experiments show that the method maintains an RMSE of 0.52 to 0.61 mm across different deformation magnitudes and achieves F1 scores of 86.11% and 95.39% at deformation magnitudes of 5 mm and 8 mm, respectively, validating its effectiveness for millimeter-level deformation detection. Real slope experiments confirm that the algorithm effectively rejects false deformation clusters and accurately identifies 15 mm of ground subsidence and 22.6 mm of ground uplift, validating its robustness and practicability in complex field environments.
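The idea behind a graduated non-convexity (GNC) estimator with Welsch weights can be shown on a one-dimensional toy problem. The sketch below is the plain textbook scheme, not the enhanced estimator of the paper, and it estimates a robust mean instead of a rigid registration; the shrink factor and scale bounds are hypothetical tuning values.

```python
import math


def gnc_welsch_mean(values, c_start, c_min, shrink=1.4, max_iter=100):
    """Robust mean by iteratively reweighted least squares with Welsch weights.

    The scale c starts large, so the cost is close to quadratic, and is
    shrunk step by step, so outliers lose influence gradually and the
    estimate is less likely to be trapped in a poor local minimum.
    """
    est = sum(values) / len(values)          # ordinary mean as the start
    c = c_start
    for _ in range(max_iter):
        # Welsch weight: near 1 for small residuals, near 0 for large ones.
        weights = [math.exp(-((v - est) ** 2) / (c * c)) for v in values]
        total = sum(weights)
        if total == 0.0:
            break                            # every weight underflowed
        est = sum(w * v for w, v in zip(weights, values)) / total
        if c <= c_min:
            break
        c = max(c / shrink, c_min)           # tighten the scale
    return est
```

With the samples `[1.0, 1.1, 0.9, 1.0, 10.0]` the ordinary mean is 2.8, while the graduated scheme settles near 1.0 because the weight of the outlier decays toward zero as the scale shrinks.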
Labelled data are essential for road extraction from optical images; however, creating high-quality labels is labor- and time-intensive. Moreover, the transferability of network models across different regions, sensors, and imaging times is limited, restricting their broader application in spatio-temporal contexts. To address this issue, we propose a road extraction method for heterogeneous data using sparse labels that combines optical imagery with OpenStreetMap (OSM) data. Sparse road labels are first generated from OSM vector data through rasterization and coordinate alignment. Then, the segment anything model (SAM) and simple linear iterative clustering (SLIC) are integrated to extract multi-level image features, and the labels are propagated through object-level processing for an initial optimization. Finally, a network model is trained using both the optical images and the coarsely optimized labels; it refines label accuracy via image-label association mapping, and the result is further optimized using OSM data as a buffer. Experimental validation using the RoadNet and Oklahoma datasets with four semantic segmentation networks (UNet, D-LinkNet, MANet, and UNetFormer) demonstrates that the proposed method outperforms existing methods in both quantitative accuracy and overall performance, especially in challenging road extraction areas.
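Object-level label propagation can be sketched as a vote inside each image segment. The abstract does not give the propagation rule, so the rule used here (mark a whole segment as road when the fraction of sparsely labelled pixels in it reaches a threshold) and the `min_fraction` value are assumptions; the segment map would come from a segmentation step such as SLIC or SAM, which is not shown.

```python
def propagate_labels(segments, sparse_labels, min_fraction=0.3):
    """Spread sparse road labels to whole segments.

    segments      : 2-D list of segment ids, one per pixel
    sparse_labels : 2-D list of 0/1 road labels of the same shape
    min_fraction  : hypothetical share of labelled pixels needed per segment
    """
    total, hits = {}, {}
    for seg_row, lab_row in zip(segments, sparse_labels):
        for seg, lab in zip(seg_row, lab_row):
            total[seg] = total.get(seg, 0) + 1
            hits[seg] = hits.get(seg, 0) + lab
    # A segment becomes road when enough of its pixels carry a sparse label.
    road = {s for s in total if hits[s] / total[s] >= min_fraction}
    return [[1 if seg in road else 0 for seg in row] for row in segments]
```

A thin rasterized centerline that covers only part of a road segment is thereby widened to the full segment, while segments that the sparse labels never touch stay background.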
The visual appearance of land cover objects in high-resolution optical remote sensing images is significantly influenced by seasonal evolution and regional differences. Enhancing the spatio-temporal controllability of generation models so that they accurately reproduce object features under specific spatio-temporal contexts remains a critical challenge. Existing research is limited both in how multi-source spatio-temporal information is encoded and in how the encoded spatio-temporal features interact with and guide visual features, making it difficult to model the mapping between spatio-temporal conditions and the visual appearance of land cover objects accurately. To address this problem, this paper proposes a framework for spatio-temporally controllable high-resolution optical remote sensing image generation. First, a multi-source spatio-temporal information encoding strategy that accounts for attribute differences is designed; it uses heterogeneous frequency encoding and independent projections to transform diverse spatio-temporal information into accurate and decoupled feature representations, thereby modeling the distinct properties of each type of information. Second, an interaction guidance mechanism between spatio-temporal features and visual features based on decoupled attention is designed. This mechanism employs an independent parallel attention branch to facilitate deep interaction between spatio-temporal features and visual features, effectively leveraging the constraining role of spatio-temporal information without interfering with text-guided generation. Finally, low-rank adaptation is adopted to transfer domain knowledge efficiently by optimizing only low-rank decomposition matrices, thereby preserving the pre-trained generative priors of the base model. 
Experiments on a large-scale dataset covering seven typical regions in China demonstrate that the proposed method outperforms state-of-the-art methods by 46.69% and 14.67% in terms of spatio-temporal distribution consistency and structural-textural consistency, respectively. These results confirm the controllability and generalization potential of the proposed framework across diverse spatio-temporal scenarios.
The indoor space topological model serves as the fundamental data basis for precise navigation and personalized location services. However, the hierarchical complexity and flexible topological characteristics of large-scale indoor public spaces pose limitations for existing methods in multi-floor space partitioning and topological relationship representation. To address this, this paper proposes a semantic-assisted method for constructing multi-floor indoor spatial topology models from 3D point clouds. First, semantic information from 3D point clouds is leveraged to assist traditional height histogram models and watershed segmentation, achieving an initial multi-level “floor-room” partitioning of indoor subspaces. Next, the local connectivity between adjacent subspaces is quantitatively described using semantic constraints such as walls, floors, and ceilings. A 3D Markov random field (MRF) model is then employed to optimize the preliminary over-segmented subspaces, resulting in locally continuous and globally consistent multi-floor indoor space partitioning. Finally, doors and staircases are used to reconstruct intra-floor and inter-floor subspace connectivity, respectively, and a complete 3D indoor topological model is established based on the IndoorGML standard. Experimental results demonstrate that the proposed method achieves recall, mean intersection over union (mIoU), and precision of 97.87%, 96.05%, and 98.09% for floor segmentation, and 89.87%, 85.66%, and 94.81% for room segmentation, with a normalized graph edit distance (nGED) of 81.35% for topological relationship representation. Compared with existing state-of-the-art methods, the proposed approach improves the average precision and recall of space partitioning by approximately 9.60% and 9.30%, respectively, and enhances nGED by over 2.49%, demonstrating greater stability and reliability.
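The final connectivity step can be sketched as a graph whose nodes are subspaces and whose edges are doors and staircases. This is only an in-memory illustration with hypothetical subspace names; it does not produce an IndoorGML document, and the breadth-first search is added simply to show how the topology would be queried for navigation.

```python
from collections import deque


def build_topology(doors, stairs):
    """Undirected subspace graph: doors link rooms, stairs link floors."""
    graph = {}
    for a, b in list(doors) + list(stairs):
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_route(graph, start, goal):
    """Breadth-first search; returns the subspace sequence or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Removing the staircase edges disconnects the floors, which is why the inter-floor links have to be reconstructed explicitly before the model can support multi-floor routing.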
Heterogeneous change detection using optical and SAR imagery is of great significance for disaster emergency response and all-weather monitoring. However, the significant differences in their imaging mechanisms lead to inconsistent feature distributions, which, coupled with the lack of annotated samples and textual descriptions, restrict the detection performance of traditional methods and existing deep learning models. To this end, this paper proposes a multi-dimensional change enhancement CLIP change detection network (MCE-CLIP), aiming to tackle the challenges of heterogeneous image change detection in flood disaster scenarios. The network constructs a cross-modal semantic guidance mechanism based on “SAR image transfer-text generation”, effectively narrowing the semantic gap between heterogeneous images. Meanwhile, a pseudo-siamese visual feature extraction branch and a multi-dimensional change feature enhancement module (MCFEM) are designed. By embedding modality adapters, the domain distribution discrepancy of remote sensing images is reduced. Furthermore, the MCFEM is constructed by integrating temporal cross-attention, multi-granularity differencing, and hybrid similarity projection, achieving the efficient integration of spatiotemporal contextual information. Experimental results on two typical heterogeneous datasets demonstrate that MCE-CLIP outperforms existing mainstream heterogeneous change detection methods in core evaluation metrics such as F1 score and intersection over union.