Citation: YIN Jun-wei, CHEN Jian-ming, XUE Bai-li, ZHANG Jian. An Initialization Method for Clustering Center Based on Sorting and Partition[J]. Microelectronics & Computer, 2013, 30(6): 80-83,87.

An Initialization Method for Clustering Center Based on Sorting and Partition

Abstract: k-means clustering partitions n data points in d-dimensional space R^d into K clusters. Given an integer K, the problem is to determine a set of K points in R^d, called centers, that minimizes the mean squared distance from each data point to its nearest center; each data point is assigned to the cluster of its closest center. The direct k-means algorithm chooses its initial centers at random, and different initial centers lead to different clustering results. To address this deficiency, we propose an initialization method for cluster centers based on sorting and partition. The method is simple and easy to implement. We apply it to both real and simulated data sets, and the experiments show that, for data that is not high-dimensional, it is a simple and effective method that considerably improves clustering accuracy and efficiency.
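The abstract does not spell out the sorting and partition steps, so the following is only a minimal sketch of the general idea, not the authors' published procedure. It shows the k-means assignment and update steps with two interchangeable initializers: random selection, as in direct k-means, and a deterministic one. The function names, the sort key (coordinate sum), and the split into K equal-size contiguous chunks are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def random_init(points: List[Point], k: int, seed: int = 0) -> List[List[float]]:
    """Direct k-means initialization: k distinct data points chosen at random."""
    rng = random.Random(seed)
    return [list(p) for p in rng.sample(list(points), k)]


def sorted_partition_init(points: List[Point], k: int) -> List[List[float]]:
    """HYPOTHETICAL initializer (an assumption, not the paper's exact method):
    sort points by coordinate sum, cut the sorted list into k contiguous
    chunks of near-equal size, and use each chunk's mean as a center.
    Requires len(points) >= k so that every chunk is non-empty."""
    ordered = sorted(points, key=sum)
    n = len(ordered)
    bounds = [i * n // k for i in range(k + 1)]
    return [mean_point(ordered[bounds[i]:bounds[i + 1]]) for i in range(k)]


def kmeans(points: List[Point], centers: List[Point],
           max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Alternate assignment and update steps until the labels stop changing."""
    centers = [list(c) for c in centers]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_point(members)
    return centers, labels


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, sorted_partition_init(points, 2))
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because the deterministic initializer involves no random draw, repeated runs on the same data give the same clustering, whereas `random_init` can give different results for different seeds. A scalar sort key such as the coordinate sum discards more information as the dimension grows, which fits the abstract's restriction to data that is not high-dimensional.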

     
