AN Ji-yong, HAN Hai-ying, HOU Xiao-li. An Improved DBscan Clustering Algorithm[J]. Microelectronics & Computer, 2015, 32(7): 68-71. DOI: 10.19304/j.cnki.issn1000-7180.2015.07.016

An Improved DBscan Clustering Algorithm

  • Abstract: An improved DBscan clustering algorithm is proposed. The improvement rests on two points: (1) to address the heavy computation caused by the random selection of core points in DBscan, core points are chosen as the points that are farthest away and have more than Minpts points within distance ε; (2) to address the poor clustering quality caused by ε and Minpts being single global parameters, a second clustering pass is introduced, in which the distance from each point misjudged as noise to every cluster center is computed and the point is assigned to the nearest cluster. The silhouette coefficient is used to measure clustering quality. Experimental results show that, compared with the original DBscan algorithm, the proposed algorithm achieves better execution efficiency and clustering quality.
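
The second clustering pass and the silhouette measurement mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how cluster centers are computed or how misjudged noise points are identified, so the sketch assumes that a center is the arithmetic mean of a cluster's members, that distance is Euclidean, that noise carries the label -1, and that every noise point is reassigned. The function names `cluster_centers`, `reassign_noise` and `mean_silhouette` and the sample data are hypothetical.

```python
import math

NOISE = -1  # assumed noise label; the abstract does not fix a convention


def cluster_centers(points, labels):
    """Return {label: centroid}, taking the centroid as the mean of the members (assumption)."""
    sums, counts = {}, {}
    for p, lab in zip(points, labels):
        if lab == NOISE:
            continue
        acc = sums.setdefault(lab, [0.0] * len(p))
        for i, x in enumerate(p):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: tuple(s / counts[lab] for s in acc) for lab, acc in sums.items()}


def reassign_noise(points, labels):
    """Second pass: give each noise point the label of the nearest cluster center."""
    centers = cluster_centers(points, labels)
    if not centers:  # no clusters were found, so there is nothing to assign to
        return list(labels)
    refined = []
    for p, lab in zip(points, labels):
        if lab == NOISE:
            lab = min(centers, key=lambda c: math.dist(p, centers[c]))
        refined.append(lab)
    return refined


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; every distinct label is treated as a cluster."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical data: two tight groups plus one point that a first pass labeled as noise.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (3, 3)]
labels = [0, 0, 0, 1, 1, 1, NOISE]

refined = reassign_noise(points, labels)
print(refined)
print(mean_silhouette(points, refined))
```

Because the centers are computed once before any reassignment, the result does not depend on the order in which noise points are visited. Whether the paper updates centers as points are absorbed is not stated in the abstract.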
