Acta Geodaetica et Cartographica Sinica ›› 2017, Vol. 46 ›› Issue (10): 1534-1548. doi: 10.11947/j.AGCS.2017.20170275

• Cartography and Geo-information •

A Scale-driven Theory for Spatial Clustering

LI Zhilin1,3, LIU Qiliang1,2, TANG Jianbo2

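  1. Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Kowloon, Hong Kong;
  2. Department of Geo-Informatics, Central South University, Changsha, Hunan 410083;
  3. State-Provincial Joint Engineering Laboratory of Spatial Information Technology for High-speed Railway Safety, Southwest Jiaotong University, Chengdu, Sichuan 611756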
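  • Received: 2017-05-26 Revised: 2017-09-04 Online: 2017-10-20 Published: 2017-10-26
  • Corresponding author: LIU Qiliang E-mail: qiliang.liu@csu.edu.cn
  • About the author: LI Zhilin (1960-), male, professor. His research interests include cartography, geographic information theory, and remote sensing information extraction. E-mail: lszlli@polyu.edu.hk
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41601410; 41471383); The Natural Science Foundation of Hunan Province (No. 2017JJ3379)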

Towards a Scale-driven Theory for Spatial Clustering

LI Zhilin1,3, LIU Qiliang1,2, TANG Jianbo2   

  1. Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China;
  2. Department of Geo-Informatics, Central South University, Changsha 410083, China;
  3. State-Provincial Joint Engineering Laboratory of Spatial Information Technology for High-speed Railway Safety, Southwest Jiaotong University, Chengdu 611756, China
  • Received: 2017-05-26 Revised: 2017-09-04 Online: 2017-10-20 Published: 2017-10-26
  • Supported by:
    The National Natural Science Foundation of China (Nos. 41601410; 41471383); The Natural Science Foundation of Hunan Province (No. 2017JJ3379)

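Abstract: Spatial clustering is a powerful tool for exploratory spatial data analysis. It can be used directly to discover the distribution patterns and characteristics of geographical phenomena, and it also provides an important pre-processing step for other spatial data analysis tasks. Spatial clustering is expected to become a breakthrough point for making sense of big data. Although spatial clustering has attracted wide attention, it still faces two fundamental difficulties: "creating something out of nothing" and "no way to interpret". The first means that most methods will still find clusters even in datasets that contain no cluster structure. The second means that, even with a single clustering method, different parameter settings produce widely varying results whose meaning is unclear. The root cause of both difficulties is that scale is not properly represented as an important parameter in clustering models. Inspired by the multi-scale cognitive principle of human vision, and following the "natural principle" of multi-scale representation, the authors establish a scale-driven theory for spatial clustering. Scale is first modeled quantitatively as a parameter of the clustering model; the scale dependency of spatial clustering is then modeled as a hypothesis testing problem; finally, statistically significant multi-scale clustering results are obtained automatically by controlling the scale parameters. Under this theory, multi-scale spatial clustering models suited to different application needs can be constructed, which reduces subjectivity in the clustering process and supports comprehensive, in-depth analysis of spatial clustering patterns.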

Key words: spatial clustering, scale, natural principle, visual cognition, hypothesis testing

Abstract: Spatial clustering plays a key role in exploratory geographical data analysis. It is important for investigating the distribution of geographical phenomena, and it sometimes also serves as a pre-processing step for other geographical data analysis techniques. Although spatial clustering has received considerable attention, two serious obstacles remain: ① spatial clustering algorithms will always discover clusters in a geographical dataset, even if the input is a random dataset; ② users find it difficult to interpret the widely varying clustering results obtained with different parameters. The underlying cause is hypothesized to be that scale is not handled well in the clustering process. Accordingly, this study introduces a scale-driven theory for spatial clustering, based on the theory of human visual cognition and the natural principle of multi-scale representation. Scale is modeled as a parameter of the clustering model, the scale dependency of spatial clustering is handled by constructing a hypothesis test, and significant multi-scale clusters can then be discovered in an objective manner by controlling the scale parameters.
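
The idea of treating scale dependency as a hypothesis test can be illustrated with a minimal sketch. This is not the authors' model: it is a generic Monte Carlo test against complete spatial randomness, assuming points lie in the unit square and that a distance threshold `r` stands in for the scale parameter. The function names `pairs_within` and `clustering_p_value` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def pairs_within(points, r):
    """Count unordered point pairs whose Euclidean distance is at most r."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return int((dist[upper] <= r).sum())


def clustering_p_value(points, r, n_sim=199, seed=0):
    """Monte Carlo p-value for the null hypothesis 'no clustering at scale r'.

    Assumption: the null model is uniform random points in the unit square
    with the same number of points as the observed data.
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    observed = pairs_within(points, r)
    simulated = [pairs_within(rng.random((n, 2)), r) for _ in range(n_sim)]
    # Rank of the observed statistic among the simulated ones (one-sided).
    exceed = sum(s >= observed for s in simulated)
    return (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    random_pts = rng.random((100, 2))
    clustered_pts = np.vstack([
        rng.normal(0.3, 0.02, (50, 2)),  # tight blob near (0.3, 0.3)
        rng.normal(0.7, 0.02, (50, 2)),  # tight blob near (0.7, 0.7)
    ])
    # Scan several scales; a small p-value suggests clustering at that scale.
    for r in (0.02, 0.05, 0.1):
        print(r, clustering_p_value(random_pts, r), clustering_p_value(clustered_pts, r))
```

Scanning `r` over a range of values gives one p-value per scale, so a random dataset is not forced to yield clusters, and any reported structure is tied to the scale at which it is significant.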

Key words: spatial clustering, scale, natural principle, visual cognition, hypothesis testing

CLC number: