Acta Geodaetica et Cartographica Sinica, 2025, Vol. 54, Issue (8): 1489-1500. doi: 10.11947/j.AGCS.2025.20240254
• Photogrammetry and Remote Sensing •
Wenjian GAN, Yang ZHOU, Xiaofei HU, Luying ZHAO, Gaoshuang HUANG, Mingbo HOU
Received: 2024-06-24
Revised: 2025-07-04
Published: 2025-09-16
Contact: Yang ZHOU
E-mail: 14737117985@163.com; zhouyang3d@163.com
About the author: GAN Wenjian (born 2000), male, postgraduate student, majoring in image geo-localization. E-mail: 14737117985@163.com
Wenjian GAN, Yang ZHOU, Xiaofei HU, Luying ZHAO, Gaoshuang HUANG, Mingbo HOU. Combining projective transform and road segmentation for street view-satellite images cross-view geo-localization[J]. Acta Geodaetica et Cartographica Sinica, 2025, 54(8): 1489-1500.
Tab. 1 Recall (%) before and after adding the auxiliary training branch

| Method | CVACT R@1 | CVACT R@5 | CVACT R@10 | CVACT R@1% | CVWU R@1 | CVWU R@5 | CVWU R@10 | CVWU R@1% |
|---|---|---|---|---|---|---|---|---|
| SAFA | 81.25 | 92.46 | 94.42 | 97.75 | 75.23 | 88.51 | 91.76 | 95.05 |
| SAFA + auxiliary branch | 81.83 | 92.72 | 94.27 | 97.84 | 77.49 | 89.73 | 92.20 | 95.62 |
| GeoDTR | 83.27 | 93.30 | 94.84 | 97.83 | 73.76 | 89.01 | 92.67 | 96.84 |
| GeoDTR + auxiliary branch | 83.91 | 93.45 | 94.78 | 97.72 | 78.12 | 90.67 | 93.17 | 96.52 |
| SAIG | 84.24 | 93.94 | 95.57 | 98.32 | 84.03 | 95.55 | 97.24 | 99.25 |
| SAIG + auxiliary branch | 84.67 | 93.81 | 95.49 | 98.30 | 87.79 | 95.93 | 97.59 | 99.15 |
| TransGeo | 86.63 | 95.37 | 96.56 | 98.49 | 94.18 | 98.87 | 99.47 | 99.81 |
| TransGeo + auxiliary branch | 87.19 | 95.44 | 96.45 | 98.59 | 95.15 | 99.00 | 99.47 | 99.81 |
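
The R@K columns in Tab. 1 are recall-at-K retrieval scores. As a rough illustration of how such a score is typically computed, the sketch below assumes a square similarity matrix in which query i (a street-view image) has exactly one true match, reference i (a satellite image). The function names, the toy matrix, and the top-1% cutoff convention are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int) -> float:
    """Return the percentage of queries whose true match ranks in the top k.

    Assumption: similarity[i][j] scores query i against reference j, and the
    true match of query i is reference i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        true_score = row[i]
        # Rank of the true match = number of references scoring strictly higher.
        rank = sum(1 for score in row if score > true_score)
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


def top_one_percent_k(num_references: int) -> int:
    """Cutoff for R@1%. Assumed convention: 1% of the gallery, at least 1."""
    return max(1, num_references // 100)


# Hypothetical 3x3 example: queries 0 and 2 rank their match first,
# query 1 ranks its match second.
TOY_SIMILARITY = [
    [0.9, 0.1, 0.2],
    [0.8, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]
```

On the toy matrix, `recall_at_k(TOY_SIMILARITY, 1)` counts two of three queries as hits, and `recall_at_k(TOY_SIMILARITY, 2)` counts all three. Published evaluation code may break ties or define the 1% cutoff differently.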
Tab. 3 Effect of scale factor λ on Recall@1 accuracy (%)

| Method | FLOPs | Params/MB | CVACT, λ=0 | CVACT, λ=0.1 | CVACT, λ=0.2 | CVACT, λ=0.3 | CVACT, λ=0.4 | CVACT, λ=0.5 | CVWU, λ=0 | CVWU, λ=0.1 | CVWU, λ=0.2 | CVWU, λ=0.3 | CVWU, λ=0.4 | CVWU, λ=0.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SAFA | 4.15×10^9 | 2.78×10^6 | 81.25 | 81.19 | 81.54 | 81.66 | 81.83 | 81.50 | 75.23 | 76.86 | 75.49 | 75.89 | 77.49 | 76.30 |
| GeoDTR | 1.089×10^10 | 1.235×10^7 | 83.27 | 83.18 | 83.45 | 83.91 | 83.42 | 83.42 | 73.76 | 75.08 | 77.11 | 78.12 | 75.42 | 75.20 |
| SAIG | 1.245×10^10 | 1.552×10^7 | 84.24 | 84.50 | 84.38 | 84.67 | 84.39 | 84.19 | 84.03 | 85.13 | 86.88 | 86.76 | 86.76 | 87.79 |
| TransGeo | 1.247×10^10 | 2.236×10^7 | 86.63 | 86.70 | 87.19 | 86.72 | 86.52 | 85.99 | 94.18 | 94.83 | 94.90 | 95.12 | 95.15 | 95.08 |
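
To make the trend in Tab. 3 easier to read, the following sketch picks, for each method and dataset, the λ with the highest Recall@1 and reports the gain over the λ = 0 column. The numbers are copied from Tab. 3; the helper name `best_lambda` and the data layout are illustrative choices, not part of the paper.

```python
from typing import Dict, List, Tuple

LAMBDAS: List[float] = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

# Recall@1 (%) per method and dataset, ordered to match LAMBDAS (from Tab. 3).
RECALL_AT_1: Dict[str, Dict[str, List[float]]] = {
    "SAFA": {
        "CVACT": [81.25, 81.19, 81.54, 81.66, 81.83, 81.50],
        "CVWU": [75.23, 76.86, 75.49, 75.89, 77.49, 76.30],
    },
    "GeoDTR": {
        "CVACT": [83.27, 83.18, 83.45, 83.91, 83.42, 83.42],
        "CVWU": [73.76, 75.08, 77.11, 78.12, 75.42, 75.20],
    },
    "SAIG": {
        "CVACT": [84.24, 84.50, 84.38, 84.67, 84.39, 84.19],
        "CVWU": [84.03, 85.13, 86.88, 86.76, 86.76, 87.79],
    },
    "TransGeo": {
        "CVACT": [86.63, 86.70, 87.19, 86.72, 86.52, 85.99],
        "CVWU": [94.18, 94.83, 94.90, 95.12, 95.15, 95.08],
    },
}


def best_lambda(scores: List[float]) -> Tuple[float, float]:
    """Return (best λ, gain in percentage points over the λ = 0 score)."""
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return LAMBDAS[best_idx], round(scores[best_idx] - scores[0], 2)


def summarize() -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Best λ and gain for every (method, dataset) pair."""
    return {
        (method, dataset): best_lambda(scores)
        for method, per_dataset in RECALL_AT_1.items()
        for dataset, scores in per_dataset.items()
    }
```

Calling `summarize()` shows that the best λ differs by method and dataset (between 0.2 and 0.5), and that the gains on CVWU are larger than those on CVACT for every method listed.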