Acta Geodaetica et Cartographica Sinica (测绘学报)

An Adaptive Density-Based Spatial Clustering Algorithm (一种基于密度的自适应空间聚类算法)

LI Guang-Qiang1, DENG Min2, LIU Qi-Liang3, CHENG Tao3

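    1. Department of Surveying and Land Information Engineering, Central South University
    2. Department of Surveying and Land Information Engineering, Central South University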
    3.
  • Received: 2008-12-01 Revised: 2009-02-06 Online: 2011-12-28 Published: 2019-01-01
  • Corresponding author: LI Guang-Qiang

An Adaptive Density-Based Spatial Clustering Method

LI Guang-Qiang, DENG Min, LIU Qi-Liang, CHENG Tao

  • Received:2008-12-01 Revised:2009-02-06 Online:2011-12-28 Published:2019-01-01
  • Contact: LI Guang-Qiang

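Summary: Most existing spatial clustering methods cluster with fixed thresholds and give satisfactory results when spatial point objects are approximately uniformly distributed. In the real world, however, spatial data exhibit spatial heterogeneity. When spatial points are unevenly distributed, existing methods ignore the differences in local density, so their results are unsatisfactory and fail to reflect the distribution patterns hidden in the data. This paper therefore studies an adaptive density-based spatial clustering algorithm, ADBSC, and proposes a new measure of local spatial density: the maximum distance to the k spatial nearest neighbors. A distance change rate index is then introduced. Within the k-spatial neighborhood, a distance change rate threshold is used to decide whether a spatial point is a core and to test whether points are directly density-reachable or density-reachable; all density-connected cores and their boundary points are then grouped into one cluster. This yields a spatial clustering method that adapts automatically to changes in local spatial density, and a detailed description of the ADBSC algorithm is given. Finally, the proposed method is evaluated through experiments and a case study. Simulation experiments show that the algorithm adapts automatically to changes in local spatial density and reveals spatial clusters of various shapes; a comparative experiment confirms that ADBSC is more widely applicable than DBSCAN and is computationally efficient; and tests on real data further show that ADBSC has good practical value.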

Abstract: Most spatial clustering methods use fixed thresholds during clustering, which implicitly assumes that the spatial points are homogeneously (evenly) distributed. In practice, spatial points are usually distributed unevenly, with varying density, which makes fixed-threshold methods inappropriate. This paper therefore develops an adaptive density-based spatial clustering algorithm, ADBSC for short. A new measure of local spatial density, the maximum distance in the k-spatial nearest neighborhood (k-NN for short), is proposed. Within the k-NN, a Dap (distance alternation proportion) threshold is used to judge whether a point is a core and to check whether points are directly density-reachable or density-reachable. All density-reachable cores and their boundary points compose a spatial cluster, so the clustering adapts to changes in local density among spatial points. The ADBSC algorithm is described in detail. A simulation test demonstrates that ADBSC can discover clusters of arbitrary shape and is robust to noise. A comparison between ADBSC and DBSCAN shows that ADBSC is more flexible and efficient than DBSCAN. Finally, real-life data are used to show that ADBSC is more practical than DBSCAN.
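
The local density measure named in the abstract, the maximum distance in the k-spatial nearest neighborhood, can be illustrated with a minimal sketch. It assumes Euclidean distance and a brute-force neighbor search; the function name `knn_max_distance` and the toy coordinates are hypothetical and do not come from the paper. The abstract does not define how the Dap threshold is computed, so the sketch stops at the density measure and does not attempt the core test or the clustering itself.

```python
import math


def knn_max_distance(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor.

    This is the largest distance inside the point's k-nearest neighborhood:
    small values indicate a locally dense region, large values a sparse one.
    Assumption: plain Euclidean distance, brute-force O(n^2) search.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Sorted distances from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


# Hypothetical toy data: one tight group and one much looser group.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
kdist = knn_max_distance(dense + sparse, k=2)

for point, d in zip(dense + sparse, kdist):
    print(point, round(d, 3))
```

With this toy data the k-distance of the tight group is roughly twenty times smaller than that of the loose group. No single fixed radius suits both groups, which is the motivation the abstract gives for an adaptive threshold.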