考虑空间数据分布的复杂性与不连续性,提出了一种点目标聚类方法.算法利用全要素Voronoi图准确识别与表达点目标与线面实体的空间相关性;根据点目标位置分布特征计算面积阈值来控制聚类的粒度,同时以空间尺度变化下面积阈值的恒定作为判断尺度收敛的条件,实现点目标的多尺度划分,时间复杂度为O(n log n).经试验验证,聚类尺度随点目标分布特征自适应收敛,算法无须自定义参数,能够有效地发现受线面目标约束的任意形态点目标集群,对异常值处理稳健.
Considering the complexity and discontinuity of spatial data distribution, a clustering algorithm of points was proposed. To accurately identify and express the spatial correlation among points,lines and polygons, a Voronoi diagram that is generated by all spatial features is introduced. According to the distribution characteristics of point's position, an area threshold used to control clustering granularity was calculated. Meanwhile, judging scale convergence by constant area threshold, the algorithm classifies spatial features based on multi-scale, with an O(n log n) running time.Results indicate that spatial scale converges self-adaptively according with distribution of points.Without the custom parameters, the algorithm capable to discover arbitrary shape clusters which be bound by lines and polygons, and is robust for outliers.
[1] NG R T, HAN J W. Efficient and Effective Clustering Methods for Spatial Data Mining[C]//Proceedings of the 20th VLDB Conference. Santiago, Chile:[s.n.], 1994: 144-155.
[2] ESTIVILL-CASTROV, LEEI. AUTOCLUST+: Automatic Clustering of Point-data Sets in the Presence of Obstacles[C]//Proceedings of the 1st International Workshop on Temporal, Spatial and Spatio-temporal Data Mining. Berlin: Springer-Verlag, 2001: 133-146.
[3] LEUNG Y, ZHANGJ S, XUZ B. Clustering by Scale-space Filtering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(12): 1396-1410.
[4] TUNG A K H, HOU J, Han J W. Spatial Clustering in the Presence of Obstacles[C]//Proceedings of the 17th International Conference on Data Engineering. Heidelberg: IEEE, 2001: 359-367.
[5] ZAIANE O R, LEE C H. Clustering Spatial Data when Facing Physical Constraints[C]//Proceedings of the 2002 IEEE International Conference on Data Mining. Maebashi City, Japan:IEEE, 2002: 737-740.
[6] FAN B. A Hybrid Spatial Data Clustering Method for Site Selection: The Data Driven Approach of GIS Mining[J]. Expert Systems with Applications, 2009, 36(2): 3923-3936.
[7] YANG Yang, SUN Zhiwei, ZHAO Zheng. Density-based Spatial Clustering Method with Obstacle Constraints[J]. Computer Applications, 2007, 27(7): 1688-1691. (杨杨, 孙志伟, 赵政. 一种处理障碍约束的基于密度的空间聚类算法[J]. 计算机应用, 2007, 27(7): 1688-1691.)
[8] CAO Keyan, WANG Guoren, HAN Donghong, et al. Clustering Algorithm of Uncertain Data in Obstacle Space[J]. Journal of Frontiers of Computer Science and Technology, 2012, 6(12): 1087-1097. (曹科研, 王国仁, 韩东红, 等. 障碍空间中不确定数据聚类算法[J]. 计算机科学与探索, 2012, 6(12): 1087-1097.)
[9] WANG Min, ZHOU Chenghu, PEI Tao, et al. MSCMO: A Scale Space Clustering Algorithm Based on Mathematical Morphology Operators[J]. Journal of Remote Sensing, 2004, 8(1): 45-50.(汪闽, 周成虎, 裴韬, 等. MSCMO: 基于数学形态学算子的尺度空间聚类方法[J]. 遥感学报, 2004, 8(1): 45-50.)
[10] KONG Juan, XU Futian, XUE Qingfeng. Clustering Algorithm Based on Mathematical Morphology in the Presence of Obstacles[J]. Computer Knowledge and Technology, 2008, 4(9): 2923-2925. (孔娟, 徐夫田, 薛庆峰. 基于数学形态学的带障碍约束的空间聚类算法研究[J]. 电脑知识与技术, 2008, 4(9): 2923-2925.)
[11] ESTIVILL-CASTROV, LEEI. Clustering with Obstacles for Geographical Data Mining[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2004, 59(1-2): 21-34.
[12] SHI Yan, LIU Qiliang, DENG Min, et al. A Novel Spatial Clustering Method with Spatial Obstacles[J]. Geomatics and Information Science of Wuhan University, 2012, 37(1): 96-100. (石岩, 刘启亮, 邓敏, 等. 一种顾及障碍约束的空间聚类方法[J]. 武汉大学学报: 信息科学版, 2012, 37(1): 96-100.)
[13] LIUD Q, NOSOVSKIYG V, SOURINAO. Effective Clustering and Boundary Detection Algorithm Based on Delaunay Triangulation[J]. Pattern Recognition Letters, 2008, 29(9): 1261-1273.
[14] EDLA D R, JANA P K. A Novel Clustering Algorithm Using Voronoi Diagram[C]//Proceedings of the 7th International Conference on Digital Information Management. Macau: IEEE, 2012: 35-40.
[15] XUE Lixia, WANG Linlin, WANG Zuocheng, et al. Spatial Clustering in the Presence of Obstacles Based on Voronoi Diagram[J]. Computer Science, 2007, 34(2): 189-191. (薛丽霞, 汪林林, 王佐成, 等. 基于Voronoi图的有障碍物空间聚类[J]. 计算机科学, 2007, 34(2): 189-191.)
[16] LUO Jiancheng, YEE Leung, ZHOU Chenghu. Scale Space Based Hierarchical Clustering Method and Its Application to Remotely Sensed Data Classification[J]. Acta Geodaetica et Cartographica Sinica, 1999, 28(4): 319-324. (骆剑承, 梁怡, 周成虎. 基于尺度空间的分层聚类方法及其在遥感影像分类中的应用[J]. 测绘学报, 1999, 28(4): 319-324.)
[17] CHEN Xiaoyu. A Study of Economic Regionalization Based on Multiscale Spatial Clustering[J]. Journal of Chongqing Normal University:Natural Science, 2011, 28(5): 81-84, 92. (陈小瑜. 基于多尺度空间聚类的经济区域划分研究[J]. 重庆师范大学学报: 自然科学版, 2011, 28(5): 81-84, 92.)
[18] ESTIVILL-CASTROV, LEEI. Amoeba: Hierarchical Clustering Based on Spatial Proximity Using Delaunay Diagram[C]//Proceedings of the 9thInternational Symposium on Spatial Data Handling. [S.l.]: IGU Study Group on Geographical Information Science, 2000: 1-16.
[19] SHI Peibei, GUO Yutang, HU Yujuan, et al.Multiscale Spectral Clustering Algorithm[J]. Computer Engineering and Applications, 2011, 47(8): 128-130. (施培蓓, 郭玉堂, 胡玉娟, 等. 多尺度的谱聚类算法[J]. 计算机工程与应用, 2011, 47(8): 128-130.)
[20] FU Lihua, LI Hongwei, ZHANG Meng, et al. Multi-scale Radial Basis Function Networks with Multiple Kernels[J]. Journal of Huazhong University of Science & Technology:Nature Science Edition, 2010, 38(1): 39-42. (付丽华, 李宏伟, 张猛, 等. 带多个核函数的多尺度径向基函数网络[J]. 华中科技大学学报: 自然科学版, 2010, 38(1): 39-42.)
[21] CHEN Jun. Voronoi Dynamic Spatial Data Model[M]. Beijing: Surveying and Mapping Press, 2002. (陈军. Voronoi动态空间数据模型[M]. 北京: 测绘出版社, 2002.)
[22] GOLD C M. Dynamic Spatial Data Structures:The Voronoi Approach[C]//Proceedings of the 1992 Canadian Conference on GIS.Ottawa:CIG, 1992: 245-255.
[23] MACQUEEN J. Some Methods for Classification and Analysis of Multivariate Observations[C]//Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. Berkley:[s.n.],1967: 281-297.
[24] HAN J W, KAMBER M. Data Mining: Concepts and Techniques[M]. Beijing: China Machine Press, 2001. (HAN J W, KAMBER M. 数据挖掘: 概念与技术[M]. 北京: 机械工业出版社, 2001.)
[25] ZHANGT, RAMAKRISHNAN R, LIVNY M. BIRCH: An Efficient Data Clustering Method for Very Large Databases[C]//Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data.New York: ACM, 1996: 103-114.