Acta Geodaetica et Cartographica Sinica, 2016, Vol. 45, Issue (11): 1335-1341. doi: 10.11947/j.AGCS.2016.20150371

• Cartography and Geographic Information •

A Multi-scale Mining Method for Significant Spatial Co-location Patterns

HE Zhanjun, LIU Qiliang, DENG Min, CAI Jiannan

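  1. Department of Geo-Informatics, School of Geosciences and Info-Physics, Central South University, Changsha, Hunan 410083, China
  • Received: 2015-07-13 Revised: 2016-09-10 Online: 2016-11-20 Published: 2016-12-03
  • Corresponding author: LIU Qiliang E-mail: qiliang.liu@csu.edu.cn
  • About the author: HE Zhanjun (1987-), male, PhD candidate; main research interests are methods and applications of spatio-temporal association pattern mining. E-mail: hezhanjun000@126.com
  • Supported by:
    National Natural Science Foundation of China (41471385; 41601410); Hunan Provincial Natural Science Fund for Distinguished Young Scholars (14JJ1007); Open Fund of the State Key Laboratory of Resources and Environmental Information System; Science and Technology Plan Project of Hunan Province (2015SK2078)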

A Multi-scale Method for Mining Significant Spatial Co-location Patterns

HE Zhanjun, LIU Qiliang, DENG Min, CAI Jiannan   

  1. Department of Geo-Informatics, Central South University, Changsha 410083, China
  • Received: 2015-07-13 Revised: 2016-09-10 Online: 2016-11-20 Published: 2016-12-03
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41471385, 41601410); The Hunan Provincial Science Fund for Distinguished Young Scholars (No. 14JJ1007); State Key Laboratory of Resources and Environmental Information System; The Science and Technology Foundation of Hunan Province (No. 2015SK2078)

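Abstract: Spatial co-location pattern mining is valuable for revealing symbiotic and dependency relationships among geographic phenomena. However, prior knowledge for setting the parameter thresholds is usually lacking; if the thresholds are set improperly, the results may miss important patterns or include redundant or even erroneous ones. To address this, this paper proposes a multi-scale method for mining significant spatial co-location patterns based on pattern reconstruction. First, a mutual neighbor distance index is defined, which can be used to determine the valid range of the distance threshold. Then, a null model is built on the basis of pattern reconstruction, and statistical testing is used to discover significant spatial co-location patterns, which avoids setting an interestingness threshold. Finally, co-location patterns are mined at multiple scales, and the concept of lifetime is introduced to evaluate the validity of the multi-scale results. Experimental results show that the method effectively reduces the subjectivity of parameter setting and thereby improves the accuracy and robustness of the mining results.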

Key words: data mining, spatial co-location, statistically significant patterns, pattern reconstruction, multi-scale

Abstract: Spatial co-location pattern discovery aims to detect sets of spatial features whose instances are frequently located in geographic proximity. Such patterns can reveal unknown regularities in geographic phenomena and support decision-making. However, because little prior knowledge is usually available, it is difficult to specify thresholds for the neighbor distance and the prevalence index. As a result, the outputs of most algorithms often include insignificant or even erroneous patterns. This paper proposes a pattern-reconstruction-based approach that discovers only significant co-location patterns. First, a new measure, MNND, is defined to identify the lower and upper bounds of the neighbor distance threshold. Then, a null model is constructed based on pattern reconstruction, so that selecting a prevalence threshold is replaced by hypothesis testing. Finally, significant co-location patterns are mined at multiple distances, and the results are evaluated using the notion of lifetime. The experimental results show that the approach avoids the subjectivity in determining those thresholds, thereby improving the correctness and robustness of the mining results.

Key words: data mining, spatial co-location, statistically significant patterns, pattern reconstruction, multi-scale
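
The workflow summarized in the abstract (a prevalence measure, a significance test against a null model, and repetition over several neighbor distances) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the study area is assumed to be the unit square, and the null model used here is a random toroidal shift of one feature, a much simpler stand-in for the pattern-reconstruction null model of the paper. The `lifetime` helper likewise assumes one plausible reading of the lifetime notion: the longest run of consecutive tested distances at which the pattern stays significant.

```python
import math
import random


def participation_index(a, b, d):
    """Participation index of the pattern {A, B} at neighbor distance d."""
    if not a or not b:
        return 0.0
    # Fraction of A instances with at least one B instance within d, and vice versa.
    a_in = sum(1 for p in a if any(math.dist(p, q) <= d for q in b))
    b_in = sum(1 for q in b if any(math.dist(p, q) <= d for p in a))
    return min(a_in / len(a), b_in / len(b))


def toroidal_shift(points, dx, dy, width=1.0, height=1.0):
    """Shift all points by (dx, dy), wrapping around the study rectangle."""
    return [((x + dx) % width, (y + dy) % height) for x, y in points]


def p_value(a, b, d, n_sim=199, seed=0):
    """Monte Carlo p-value of the observed index (assumed null: toroidal shift of B)."""
    rng = random.Random(seed)
    observed = participation_index(a, b, d)
    hits = 0
    for _ in range(n_sim):
        shifted = toroidal_shift(b, rng.random(), rng.random())
        if participation_index(a, shifted, d) >= observed:
            hits += 1
    # The observed pattern counts as one realization of the null model.
    return (hits + 1) / (n_sim + 1)


def significant_scales(a, b, distances, alpha=0.05, n_sim=199):
    """Test the pattern at each distance; True where p <= alpha."""
    return [p_value(a, b, d, n_sim) <= alpha for d in distances]


def lifetime(flags):
    """Longest run of consecutive scales at which the pattern is significant."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best
```

In this sketch the hypothesis test takes the place of a user-chosen prevalence threshold, and `lifetime(significant_scales(a, b, distances))` summarizes how persistently the pattern holds across the tested distances. The range of distances to test would come from a measure such as the MNND, which is not reproduced here.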

CLC number: