An Initialization Method for Clustering Center Based on Sorting and Partition
Abstract
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer K; the problem is to determine a set of K points in R^d, called centers, that minimizes the mean squared distance from each data point to its nearest center. The direct k-means algorithm chooses its initial centers randomly, and different initial centers lead to different results. To address this deficiency, we propose a novel method for choosing initial centers based on sorting and partition. We apply it to real and simulated data, and the results show that it is a simple and efficient method that improves both clustering accuracy and efficiency.
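As an illustration of the objective stated above, the following minimal sketch computes the mean squared distance from each data point to its nearest center, and evaluates it for two different choices of centers on a toy data set. The function name `mean_squared_distance` and the toy points are assumptions made for illustration only; the sketch does not implement the proposed sorting-and-partition initialization.

```python
import numpy as np

def mean_squared_distance(points, centers):
    """Mean squared Euclidean distance from each point to its nearest center."""
    points = np.asarray(points, dtype=float)    # shape (n, d)
    centers = np.asarray(centers, dtype=float)  # shape (K, d)
    # Pairwise differences between every point and every center: shape (n, K, d)
    diffs = points[:, None, :] - centers[None, :, :]
    # Squared distances, shape (n, K)
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Keep the nearest center for each point, then average over the n points
    return float(np.mean(np.min(sq_dists, axis=1)))

# Hypothetical toy data: two well-separated pairs of points in R^2, with K = 2
points = [[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]]

# Centers placed at the middle of each pair
good = mean_squared_distance(points, [[0.0, 0.5], [4.0, 0.5]])

# Centers placed between the two pairs
poor = mean_squared_distance(points, [[2.0, 0.0], [2.0, 1.0]])

print(good, poor)
```

The two center sets give different objective values on the same data, which is the kind of difference that makes the quality of the final clustering depend on where the iteration starts.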