Acta Geodaetica et Cartographica Sinica ›› 2021, Vol. 50 ›› Issue (11): 1605-1616. doi: 10.11947/j.AGCS.2021.20210251

• Environment Perception for Intelligent Driving •

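  • Corresponding author: LI Hongwei E-mail: laob_811@sina.com
  • About the author: DENG Chen (born 1995), male, holds a master's degree; his research interests are software engineering and computer vision.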

Research on key frame image processing of semantic SLAM based on deep learning

DENG Chen1, LI Hongwei2, ZHANG Bin2, XU Zhibin1, XIAO Zhiyuan3   

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450052, China;
    2. School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450052, China;
    3. School of Water Conservancy Science and Engineering, Zhengzhou University, Zhengzhou 450001, China
  • Received:2021-05-11 Revised:2021-11-06 Published:2021-12-07
  • Supported by:
    The Strategic Consulting Research Project of Henan Research Institute of China Engineering Science and Technology Development Strategy-Key Technologies of Intelligent Robot Spatial Perception (No. 2020HENZT07)

Abstract: Simultaneous localization and mapping (SLAM) has broad application prospects in many fields owing to its high energy efficiency and low power consumption. However, traditional SLAM systems still have several problems: key frames in traditional visual odometry carry no semantic information, the image information acquired by a mobile robot is relatively limited, and key frames captured in real scenes often contain a large number of mismatched points and dynamic points. To address these problems, this paper proposes a semantic SLAM approach. First, to match correct, corresponding feature points and to eliminate interference from dynamic points and mismatched points, a method for judging the state of features in adjacent frames based on Lucas-Kanade optical flow is proposed. This function is added as a new thread to the visual odometry part of ORB-SLAM3, optimizing and improving part of the traditional SLAM framework. Second, because the image frames obtained by the front-end visual odometry of a traditional SLAM system contain no semantic information, a YOLOv4-based object detection algorithm and a Mask R-CNN semantic segmentation algorithm fused with a fully connected conditional random field (CRF) are used to process the key frame images in ORB-SLAM3. This effectively improves the ability of robots and other smart devices to perceive indoor environments.

Key words: simultaneous localization and mapping, ORB-SLAM3, three-dimensional semantic map, deep learning, Mask R-CNN
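
The abstract describes judging the state of features in adjacent frames from Lucas-Kanade optical flow but does not state the decision rule. The sketch below is only an illustration under assumptions: it takes point positions that an LK tracker would have produced, treats the median flow as the camera-induced motion, and labels a feature by how far its flow deviates from that median. The function name `classify_features`, both thresholds, and the median-flow criterion are hypothetical and are not taken from the paper.

```python
from math import hypot
from statistics import median


def classify_features(prev_pts, curr_pts, tracked,
                      dynamic_thresh=2.0, mismatch_thresh=20.0):
    """Label each feature as 'static', 'dynamic', or 'mismatch'.

    prev_pts, curr_pts: (x, y) positions of the same features in two
        adjacent frames, e.g. as returned by an LK optical flow tracker.
    tracked: per-feature success flag reported by the tracker.
    The thresholds (in pixels) are illustrative assumptions.
    """
    # Flow vectors of the successfully tracked features only.
    flows = [(c[0] - p[0], c[1] - p[1])
             for p, c, ok in zip(prev_pts, curr_pts, tracked) if ok]
    if not flows:
        return ["mismatch"] * len(prev_pts)

    # Assumption: most features are static, so the median flow
    # approximates the apparent motion caused by the camera.
    med_dx = median(f[0] for f in flows)
    med_dy = median(f[1] for f in flows)

    labels = []
    for p, c, ok in zip(prev_pts, curr_pts, tracked):
        if not ok:
            labels.append("mismatch")  # tracker lost the feature
            continue
        residual = hypot(c[0] - p[0] - med_dx, c[1] - p[1] - med_dy)
        if residual > mismatch_thresh:
            labels.append("mismatch")  # implausibly large deviation
        elif residual > dynamic_thresh:
            labels.append("dynamic")   # moves relative to the scene
        else:
            labels.append("static")
    return labels


prev = [(10, 10), (50, 20), (30, 40), (80, 80), (60, 60)]
curr = [(12, 10), (52, 20), (32, 40), (88, 85), (160, 10)]
labels = classify_features(prev, curr, [True] * 5)
```

In a pipeline of the kind the abstract outlines, only the features labeled `static` would presumably be passed on to pose estimation. A single median flow is a crude motion model that ignores depth and rotation, so it should be read as a placeholder for whatever criterion the paper actually uses.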

CLC number: