Acta Geodaetica et Cartographica Sinica ›› 2021, Vol. 50 ›› Issue (11): 1605-1616.doi: 10.11947/j.AGCS.2021.20210251

• Environment Perception for Intelligent Driving •

Research on key frame image processing of semantic SLAM based on deep learning

DENG Chen1, LI Hongwei2, ZHANG Bin2, XU Zhibin1, XIAO Zhiyuan3   

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450052, China;
    2. School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450052, China;
    3. School of Water Conservancy Science and Engineering, Zhengzhou University, Zhengzhou 450001, China
  • Received: 2021-05-11  Revised: 2021-11-06  Published: 2021-12-07
  • Supported by:
    The Strategic Consulting Research Project of Henan Research Institute of China Engineering Science and Technology Development Strategy-Key Technologies of Intelligent Robot Spatial Perception (No. 2020HENZT07)

Abstract: Simultaneous localization and mapping (SLAM) has broad application prospects in many fields owing to its high energy efficiency and low power consumption. However, traditional SLAM systems still suffer from several problems: key frames in traditional visual odometry carry no semantic information, the image information available to a mobile robot is relatively limited, and key frames captured in real scenes always contain a large number of mismatched points and dynamic points. To address these problems, this paper proposes a new approach to semantic SLAM mapping. First, in order to find correctly corresponding feature points while discarding the interference of dynamic points and mismatched points, a method based on the Lucas-Kanade optical flow is proposed to judge the state of features between adjacent frames; this function is added as a new thread to the visual odometry part of ORB-SLAM3, optimizing and improving part of the traditional SLAM framework. Second, to address the problem that the image frames obtained by the front-end visual odometry of a traditional SLAM system contain no semantic information, a YOLOv4-based object detection algorithm and a Mask R-CNN semantic segmentation algorithm fused with a fully connected conditional random field (CRF) are used to process the key frame images of ORB-SLAM3, effectively improving the perception of indoor environments by robots and other smart devices.
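For illustration, the following is a minimal Python/OpenCV sketch (not the authors' code) of the idea summarized above: the state of feature points is judged between adjacent frames with the Lucas-Kanade optical flow method so that mismatched and dynamic points can be discarded before the key frame is processed further. The function name filter_unstable_points, the forward-backward consistency check, the median-flow heuristic, the threshold values, and the image file names are illustrative assumptions rather than details taken from the paper.

# Minimal sketch (assumptions noted above): Lucas-Kanade tracking between
# adjacent frames, keeping only points that are tracked consistently and move
# roughly with the camera-induced (median) flow.
import cv2
import numpy as np

def filter_unstable_points(prev_gray, curr_gray, prev_pts, fb_thresh=1.0, motion_thresh=3.0):
    """Return the subset of prev_pts (and their matches) judged static and correctly matched."""
    # Forward LK tracking: previous frame -> current frame
    curr_pts, st_fwd, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    # Backward LK tracking: current frame -> previous frame (consistency check)
    back_pts, st_bwd, _ = cv2.calcOpticalFlowPyrLK(curr_gray, prev_gray, curr_pts, None)

    fb_error = np.linalg.norm(prev_pts - back_pts, axis=2).ravel()  # forward-backward error
    flow_mag = np.linalg.norm(curr_pts - prev_pts, axis=2).ravel()  # apparent motion per point

    # The median flow approximates camera-induced motion; points moving much
    # faster are treated as dynamic, points failing the FB check as mismatches.
    median_flow = np.median(flow_mag)
    keep = (st_fwd.ravel() == 1) & (st_bwd.ravel() == 1) \
         & (fb_error < fb_thresh) & (np.abs(flow_mag - median_flow) < motion_thresh)
    return prev_pts[keep], curr_pts[keep]

if __name__ == "__main__":
    # Hypothetical adjacent grayscale frames from an RGB-D or monocular sequence
    prev_gray = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
    curr_gray = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500, qualityLevel=0.01, minDistance=7)
    static_prev, static_curr = filter_unstable_points(prev_gray, curr_gray, pts)
    print(f"kept {len(static_prev)} of {len(pts)} feature points")

In the paper's pipeline this kind of check runs as an additional thread alongside the ORB-SLAM3 visual odometry; the surviving key frame points are then handed to the YOLOv4 and Mask R-CNN + CRF stages for semantic labelling.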

Key words: simultaneous localization and mapping, ORB-SLAM3, three-dimensional semantic map, deep learning, Mask R-CNN
