Acta Geodaetica et Cartographica Sinica, 2024, Vol. 53, Issue (6): 1057-1076. doi: 10.11947/j.AGCS.2024.20230259

• Smart Surveying and Mapping •

Research progress in the application of image semantic information in visual SLAM

Chi GUO1,2,3, Yang LIU1, Yarong LUO2, Jingnan LIU2, Quan ZHANG2

  1. Hubei Luojia Laboratory, Wuhan University, Wuhan 430079, China
  2. Research Center of GNSS, Wuhan University, Wuhan 430079, China
  3. Artificial Intelligence Institute, Wuhan University, Wuhan 430079, China
  • Received: 2023-09-08 Published: 2024-07-22
  • About author: GUO Chi (b. 1983), male, PhD, professor. His research focuses on applications of BeiDou technology, intelligent navigation of unmanned systems, and the theory and methods of location services. E-mail: guochi@whu.edu.cn
  • Supported by:
    The National Key Research and Development Program of China (2022YFB3903801); The Major Science and Technology Project of Hubei Province (2022AAA009); The Open Fund of Hubei Luojia Laboratory; The China Postdoctoral Science Foundation (2023TQ0248)

Abstract:

Visual simultaneous localization and mapping (VSLAM) uses cameras as the primary sensor to capture image data, estimates the position and orientation of the carrier with algorithms such as multi-view geometry and state estimation, and simultaneously builds a map for navigation and localization. VSLAM is a key technology in autonomous driving, augmented, virtual, and mixed reality (AR, VR, MR), intelligent robotics, and drone flight control. In recent years, as demand for intelligent navigation and localization has grown across industries, VSLAM, which originally focused on geometric measurement, has gradually incorporated a semantic understanding of the environment. Semantic information refers to concepts that humans can directly perceive and understand; in images, it includes object contours, categories, and saliency. Compared with geometric structures and features, image semantic information is more consistent over time and space and yields results closer to human perception. Introducing image semantic information into VSLAM can not only improve the performance of each module of the system but also enhance its intelligent perception ability, forming a semantic VSLAM that integrates geometric measurement, localization, and environment understanding. In this article, based on the applications of image semantic information, we summarize the classic solutions and the latest research progress in semantic VSLAM. We then identify the existing problems and challenges in semantic VSLAM and propose future research directions to further promote its development toward intelligent navigation and localization.

Key words: visual SLAM, visual semantic SLAM, deep learning, intelligent navigation and localization

CLC Number: