Acta Geodaetica et Cartographica Sinica (测绘学报) ›› 2023, Vol. 52 ›› Issue (10): 1703-1713. doi: 10.11947/j.AGCS.2023.20220466

• Photogrammetry and Remote Sensing •

Semantic segmentation method of 3D scenes using dynamic graph CNN for complex city

ZHANG Rongting1, ZHANG Guangyun1, YIN Jihao2

  1. Nanjing Tech University, Nanjing 211816, Jiangsu, China;
    2. Beihang University, Beijing 100191, China
  • Received: 2022-07-21 Revised: 2022-11-27 Published: 2023-10-31
  • Corresponding author: ZHANG Guangyun, E-mail: gyzhang1234@163.com
  • About the author: ZHANG Rongting (1989-), male, PhD, lecturer; research interest: intelligent processing of remote sensing information. E-mail: zrt@njtech.edu.cn
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41601365; 41871240)

Semantic segmentation method of 3D scenes using dynamic graph CNN for complex city

ZHANG Rongting1, ZHANG Guangyun1, YIN Jihao2   

  1. Nanjing Tech University, Nanjing 211816, China;
    2. Beihang University, Beijing 100191, China
  • Received:2022-07-21 Revised:2022-11-27 Published:2023-10-31
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41601365;41871240)

Abstract: In the field of photogrammetry and remote sensing, 3D mesh data are one of the final user products and have been widely applied in tasks such as urban planning and navigation. However, little research has addressed intelligent semantic segmentation of complex urban scenes represented as 3D meshes. This paper therefore proposes a semantic segmentation method for 3D scenes of complex cities based on a dynamic graph convolutional network (3Dcity-net). A composite feature vector built from the 3D spatial coordinates and texture information inherent in the mesh is used to represent each triangular face. To reduce the impact of noise and redundant information in the texture on the segmentation results, a principal component analysis (PCA) module is embedded in the 3Dcity-net architecture. To alleviate the loss of segmentation accuracy caused by imbalanced sample data, the focal loss function replaces the cross-entropy loss function. Semantic segmentation experiments were conducted on the Hessigheim 3D mesh data. The results show that 3Dcity-net achieves competitive 3D mesh semantic segmentation, with an overall accuracy (OA), Kappa coefficient, mean precision (mP), mean recall (mR), mean F1 score, and mean intersection over union (mIoU) of 81.5%, 0.776, 73.0%, 58.4%, 62.6%, and 49.8%, respectively. Compared with two state-of-the-art methods, the OA of the proposed method is higher by 0.9% and 8.3%, respectively.
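
As a concrete illustration of the loss substitution described above, the following is a minimal PyTorch-style sketch of a multi-class focal loss applied to per-face logits. It follows the standard formulation FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the function name, the default gamma, and the optional per-class weights are illustrative assumptions rather than the authors' exact implementation.

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, gamma=2.0, alpha=None):
        # logits:  (N, C) raw class scores for N mesh faces
        # targets: (N,)   integer class labels
        # gamma:   focusing parameter; gamma = 0 recovers cross-entropy
        # alpha:   optional (C,) per-class weights for rare classes (assumed, not from the paper)
        log_p = F.log_softmax(logits, dim=-1)                       # log-probability of every class
        log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)   # log-probability of the true class
        pt = log_pt.exp()
        loss = -((1.0 - pt) ** gamma) * log_pt                      # down-weight well-classified faces
        if alpha is not None:
            loss = alpha[targets] * loss                            # emphasise under-represented classes
        return loss.mean()

Setting gamma = 0 and alpha = None reduces this expression to the ordinary cross-entropy loss, which is the baseline the paper replaces.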

Key words: 3D real scene, semantic segmentation, graph convolutional network, 3D representation, 3D mesh

Abstract: In the photogrammetry and remote sensing community, the 3D mesh is one of the final user products and is widely applied in urban planning, navigation, etc. However, there are few deep-learning-based works on semantic segmentation of complex urban scenes represented as 3D meshes. Thus, a semantic segmentation method of 3D scenes using a dynamic graph CNN for complex cities (3Dcity-net) is proposed. Using mesh-inherent features containing 3D spatial information and texture information, a composite feature vector is proposed to represent each face in the 3D mesh. To reduce the influence of the noise and redundant information in the texture on semantic segmentation, a principal component analysis (PCA) module is embedded into the proposed 3Dcity-net. To alleviate the decrease in semantic segmentation accuracy caused by unbalanced sample data, the focal loss function is used to replace the cross-entropy loss function. The Hessigheim 3D mesh data are used in the experiments. The experimental results show that the proposed method obtains competitive semantic segmentation results on 3D meshes: the overall accuracy, Kappa coefficient, mean precision, mean recall, mean F1 score, and mean IoU are 81.5%, 0.776, 73.0%, 58.4%, 62.6%, and 49.8%, respectively. Compared with two state-of-the-art methods, the overall accuracy increases by 0.9% and 8.3%, respectively.
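
As a sketch of how a composite per-face feature vector with PCA-reduced texture might be assembled, the snippet below combines face centroids computed from the mesh geometry with texture descriptors compressed by scikit-learn's PCA. The array names, the per-face texture descriptor T, and the number of retained components are assumptions for illustration; in the paper the PCA step is embedded inside the network rather than applied as offline preprocessing.

    import numpy as np
    from sklearn.decomposition import PCA

    def composite_face_features(V, F, T, n_tex_components=8):
        # V: (num_vertices, 3) vertex coordinates
        # F: (num_faces, 3)    vertex indices of each triangular face
        # T: (num_faces, k)    per-face texture descriptors sampled from the texture atlas (assumed)
        centroids = V[F].mean(axis=1)                 # (num_faces, 3) 3D position of each face
        pca = PCA(n_components=n_tex_components)      # suppress noise and redundancy in the texture
        tex_reduced = pca.fit_transform(T)            # (num_faces, n_tex_components)
        return np.concatenate([centroids, tex_reduced], axis=1)   # composite feature vector per face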

Key words: 3D real scene, semantic segmentation, graph CNN, 3D representation, 3D mesh
