Acta Geodaetica et Cartographica Sinica, 2023, Vol. 52, Issue (5): 843-851. doi: 10.11947/j.AGCS.2023.20220330

• Cartography and Geographic Information •

A dynamic weighted model for semantic similarity measurement between geographic feature categories

TAN Yongbin1,2,3, GAO Lingling3, LI Lin4,5, CHENG Penggen1,2,3, WANG Hong6, LI Xiaolong1,2,3, CHEN Cheng3

  1. Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake, Ministry of Natural Resources, East China University of Technology, Nanchang 330013, China;
    2. CNNC Engineering Research Center of 3D Geographic Information, East China University of Technology, Nanchang 330013, China;
    3. Faculty of Geomatics, East China University of Technology, Nanchang 330013, China;
    4. School of Resources and Environmental Science, Wuhan University, Wuhan 430072, China;
    5. Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430072, China;
    6. School of Resource and Environmental Science, Hubei University, Wuhan 430062, China
  • Received: 2022-05-16 Revised: 2023-01-30 Published: 2023-05-27
  • Corresponding author: CHENG Penggen E-mail: pgcheng1964@163.com
  • About the first author: TAN Yongbin (1985-), male, PhD, associate professor; research interest: semantic analysis of geographic knowledge. E-mail: tyb@ecut.edu.cn
  • Supported by:
    Open Fund of Key Laboratory of Mine Environmental Monitoring and Improving around Poyang Lake, Ministry of Natural Resources (No. MEMI-2021-2022-24); The National Natural Science Foundation of China (Nos. 41861052; 42261078)

Abstract: Semantic similarity measurement is a key technique for resolving semantic heterogeneity among geographic feature categories, and it plays an important role in geographic data sharing and exchange. Targeting the fundamental geographic information domain, this article proposes a dynamic-weight algorithm for the semantic similarity of geographic feature categories, motivated by the observation that the same semantic property can differ in importance from one category to another. The algorithm introduces term frequency-inverse document frequency (TF-IDF) and uses the specificity of property values to compute a dynamic weight for each semantic property. It defines a similarity measure for each type of property, with particular attention to decomposing multi-valued complex property values and measuring their similarity, and then combines the results into the similarity between geographic feature categories. Finally, 200 sample pairs are selected from the fundamental geographic feature categories, their semantic similarity is computed, and the results are compared with those of four other similarity algorithms. The experimental results show that the proposed algorithm effectively reflects the differences in importance among semantic properties and yields more accurate and reasonable semantic similarity between geographic feature categories.

Key words: semantic similarity, TF-IDF, dynamic weight, geographic feature category
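
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of TF-IDF-style dynamic weighting, not the authors' algorithm. It assumes each category is described by properties with sets of values, scores a value's specificity with a smoothed inverse document frequency, normalizes the per-property scores into weights for each category, and compares multi-valued properties with Jaccard overlap. The toy categories, the names `idf`, `dynamic_weights`, `jaccard`, and `similarity`, the smoothing, and the choice of Jaccard are all illustrative assumptions.

```python
import math

# Hypothetical toy data (not from the paper): category -> {property -> set of values}.
CATEGORIES = {
    "river": {"material": {"water"}, "origin": {"natural"},
              "function": {"drainage", "shipping"}},
    "canal": {"material": {"water"}, "origin": {"artificial"},
              "function": {"irrigation", "shipping"}},
    "road": {"material": {"asphalt"}, "origin": {"artificial"},
             "function": {"transport"}},
}


def idf(prop, value, categories):
    """Smoothed inverse document frequency of one property value.

    Values shared by many categories get a low score; rare values score high.
    """
    df = sum(1 for props in categories.values() if value in props.get(prop, set()))
    return math.log((1 + len(categories)) / (1 + df)) + 1.0


def dynamic_weights(name, categories):
    """Per-category property weights: mean IDF of the values, normalized to sum to 1."""
    raw = {
        prop: sum(idf(prop, v, categories) for v in values) / len(values)
        for prop, values in categories[name].items()
        if values
    }
    total = sum(raw.values())
    return {prop: score / total for prop, score in raw.items()}


def jaccard(a, b):
    """Overlap of two value sets; an assumed stand-in for a multi-valued measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(c1, c2, categories):
    """Weighted sum of property similarities, averaging the two categories' weights."""
    w1 = dynamic_weights(c1, categories)
    w2 = dynamic_weights(c2, categories)
    total = 0.0
    for prop in set(w1) | set(w2):
        weight = (w1.get(prop, 0.0) + w2.get(prop, 0.0)) / 2
        total += weight * jaccard(
            categories[c1].get(prop, set()), categories[c2].get(prop, set())
        )
    return total


if __name__ == "__main__":
    print(dynamic_weights("river", CATEGORIES))
    print(similarity("river", "canal", CATEGORIES))
```

In this toy setup the same property receives a different weight in different categories: `origin` weighs more for `river`, whose value `natural` is rare, than for `canal`, whose value `artificial` is shared with `road`. That is the kind of importance difference the abstract describes, under the assumptions stated above.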

CLC number: