Abstract: A novel hierarchical (agglomerative) clustering algorithm, ADA, is proposed. The algorithm proceeds in three steps: first, the data set is sampled; second, subclusters are formed by absorbing the data points in the neighborhood of each sample point; third, the final clusters are built by hierarchically merging subclusters according to whether they intersect. The distance measure between two clusters is redefined, and a heap structure is built on this measure to drive the merging. A theoretical analysis based on estimating the overall distribution of the data points shows that the algorithm approaches the optimal solution. Experimental results show that the clustering quality of ADA is much better than that of the well-known algorithm CURE.

Keywords: clustering; data mining; pattern recognition; distribution
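To make the three steps concrete, the following is a minimal Python sketch of sampling, neighborhood absorption, and heap-driven merging as outlined in the abstract. It is illustrative only: the neighborhood radius `eps`, the sample size, and the use of Euclidean centroid distance in place of the paper's redefined inter-cluster distance and intersection test are all assumptions, and the function names (`build_subclusters`, `agglomerate`) are hypothetical rather than taken from the paper.

```python
import heapq
import math
import random


def euclidean(a, b):
    """Plain Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def build_subclusters(data, sample_size, eps):
    """Steps 1-2: sample the data set, then let each sample point absorb the
    still-unassigned points within radius `eps` into its subcluster.
    (`eps` is an assumed parameter, not defined in the abstract.)"""
    samples = random.sample(data, min(sample_size, len(data)))
    assigned = [False] * len(data)
    subclusters = []
    for s in samples:
        members = []
        for i, p in enumerate(data):
            if not assigned[i] and euclidean(p, s) <= eps:
                assigned[i] = True
                members.append(p)
        if members:
            subclusters.append(members)
    return subclusters


def agglomerate(subclusters, k):
    """Step 3: merge subclusters with a min-heap of pairwise distances until
    `k` clusters remain. Centroid distance stands in for the paper's
    redefined inter-cluster distance / intersection criterion (an assumption)."""
    clusters = {i: list(c) for i, c in enumerate(subclusters)}
    heap = []
    for i in clusters:
        for j in clusters:
            if i < j:
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                heapq.heappush(heap, (d, i, j))
    next_id = len(subclusters)
    while len(clusters) > k and heap:
        d, i, j = heapq.heappop(heap)
        if i not in clusters or j not in clusters:
            continue  # stale heap entry: one side was already merged away
        merged = clusters.pop(i) + clusters.pop(j)
        for m in clusters:  # push distances from the new cluster to the rest
            dm = euclidean(centroid(merged), centroid(clusters[m]))
            heapq.heappush(heap, (dm, next_id, m))
        clusters[next_id] = merged
        next_id += 1
    return list(clusters.values())


if __name__ == "__main__":
    random.seed(0)
    # Two well-separated synthetic blobs as a toy data set.
    data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)] \
         + [(random.gauss(8, 1), random.gauss(8, 1)) for _ in range(200)]
    subs = build_subclusters(data, sample_size=40, eps=1.5)
    final = agglomerate(subs, k=2)
    print(len(subs), "subclusters merged into", len(final), "clusters")
```

In this sketch the heap holds candidate pairs keyed by their distance; entries that refer to an already-merged cluster are discarded lazily when popped, which keeps the heap maintenance simple while preserving the "always merge the closest pair next" behavior that a heap-based agglomerative scheme relies on.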