Acta Geodaetica et Cartographica Sinica ›› 2018, Vol. 47 ›› Issue (9): 1250-1260. doi: 10.11947/j.AGCS.2018.20170321

• Cartography and Geographic Information •

A Nonparametric Test Method for Identifying Significant Cross-outliers in Spatial Point Dataset

YANG Xuexi, DENG Min, SHI Yan, TANG Jianbo, LIU Qiliang

  1. School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
  • Received: 2017-06-19 Revised: 2018-05-10 Online: 2018-09-20 Published: 2018-09-26
  • Corresponding author: DENG Min E-mail: dengmin208@tom.com
  • About the author: YANG Xuexi (1989-), male, PhD candidate; research interest: geographic spatio-temporal outlier detection. E-mail: studyang@sina.cn
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41471385; 41730105); The National Key Research and Development Program of China (No. 2016YFB0502303); The Fundamental Research Funds for the Central Universities of Central South University (No. 2016zzts085)

Abstract: In geography, a spatial outlier is an object whose non-spatial attribute value differs significantly from the values of its spatial neighbors. Spatial outlier detection aims to find, in massive spatial data, sets of entities that deviate from the general pattern; because it helps to reveal the special development of geographical phenomena, it has become an important branch of spatial data mining. Existing methods have made important progress in measuring the spatial outlier factor, but most of them cannot evaluate the statistical significance of the detected outliers objectively. Furthermore, they are designed mainly for single-category datasets and do not account for the interaction between different categories of data. Based on the idea of a spatial random process, this study develops a nonparametric test for identifying significant cross-outliers between two categories of spatial point data. Firstly, a reasonable and stable spatial neighborhood is constructed for the entities of the primary dataset using constrained Delaunay triangulation. Then, the number of reference-dataset entities falling within the spatial reference neighborhood radius of each primary-dataset entity is counted to measure the initial outlier factor. Next, the support domain is constructed by the α-Shape method, a null model is built on a spatial random process, and the significance of the spatial cross-outliers is tested by Monte Carlo simulation. Finally, the stability of the spatial cross-outliers is evaluated by the living distance. Experiments on both simulated and real-world datasets, together with comparisons, show that the proposed method effectively identifies statistically significant spatial cross-outliers in spatial point datasets.

Key words: spatial data mining, spatial outlier detection, cross-outlier, nonparametric test, significance
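
The counting and Monte Carlo testing steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: a fixed `radius` stands in for the neighborhood derived from the constrained Delaunay triangulation, a rectangular `bbox` stands in for the α-Shape support domain, the null model places the reference points uniformly at random, and only unusually low counts are tested (one-sided). The function names, the toy data and the 0.05 threshold are hypothetical.

```python
import math
import random


def count_within(point, others, radius):
    """Number of points in `others` within `radius` of `point`."""
    px, py = point
    return sum(1 for (x, y) in others if math.hypot(x - px, y - py) <= radius)


def monte_carlo_p_values(primary, reference, radius, bbox, n_sim=199, seed=0):
    """One-sided (lower-tail) Monte Carlo p-value for each primary point.

    Null model (assumption): reference points are uniformly random in `bbox`,
    a rectangle standing in for the alpha-shape support domain.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    observed = [count_within(p, reference, radius) for p in primary]
    as_low = [0] * len(primary)  # simulations with a count <= observed
    for _ in range(n_sim):
        simulated = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                     for _ in reference]
        for i, p in enumerate(primary):
            if count_within(p, simulated, radius) <= observed[i]:
                as_low[i] += 1
    # The +1 terms count the observed pattern as one realisation of the null.
    return [(k + 1) / (n_sim + 1) for k in as_low]


# Hypothetical toy data: 200 reference points on a grid in the right part of
# a 10 x 10 region, one primary point far from them and one among them.
reference = [(6.2 + 0.4 * i, 0.25 + 0.5 * j)
             for i in range(10) for j in range(20)]
primary = [(2.5, 5.0), (8.0, 5.0)]
p_values = monte_carlo_p_values(primary, reference, radius=2.0,
                                bbox=(0, 0, 10, 10))
flagged = [p for p, pv in zip(primary, p_values) if pv <= 0.05]
```

In this toy setting the primary point with no reference points nearby receives a small p-value and is flagged, while the point inside the dense grid is not. A two-sided variant, or one that simulates inside a non-rectangular support domain, would follow the same pattern with a different sampling step.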

CLC number: