Author: A. Avinash Goud
Date of Publication: 7th March 2016
Abstract: The paper presents a novel algorithm for performing k-means clustering. It organizes all the patterns in a k-d tree structure so that all the patterns closest to a given prototype can be found efficiently. The main intuition behind the approach is as follows. At the root level, all the prototypes are potential candidates for the closest prototype. For the children of the root node, however, one may be able to prune the candidate set by using simple geometrical constraints. This approach can be applied recursively until the size of the candidate set is one for each node. Experimental results demonstrate that the scheme can improve the computational speed of the direct k-means algorithm by one to two orders of magnitude in the total number of distance calculations and the overall time of computation.
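The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that the "simple geometrical constraints" are bounding-box tests: a prototype is dropped for a node when its minimum distance to the node's bounding box exceeds the smallest maximum distance of any candidate. All names (build, box_min_max, prune, assign, kmeans_step, leaf_size) are hypothetical and chosen only for illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build(points, leaf_size=8):
    """Build a k-d tree node storing bounding box, count and coordinate sum."""
    dim = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]
    node = {"lo": lo, "hi": hi, "n": len(points), "kids": None,
            "sum": [sum(p[d] for p in points) for d in range(dim)],
            "points": points}
    widest = max(range(dim), key=lambda d: hi[d] - lo[d])
    if len(points) > leaf_size and hi[widest] > lo[widest]:
        pts = sorted(points, key=lambda p: p[widest])  # split at the median
        mid = len(pts) // 2
        node["kids"] = (build(pts[:mid], leaf_size), build(pts[mid:], leaf_size))
        node["points"] = None  # only leaves keep the raw points
    return node

def box_min_max(c, lo, hi):
    """Squared minimum and maximum distance from prototype c to the box."""
    dmin = dmax = 0.0
    for x, l, h in zip(c, lo, hi):
        dmin += max(l - x, 0.0, x - h) ** 2
        dmax += max(abs(x - l), abs(x - h)) ** 2
    return dmin, dmax

def prune(cands, centers, node):
    """Assumed rule: keep candidates whose min distance <= smallest max distance."""
    bounds = [box_min_max(centers[j], node["lo"], node["hi"]) for j in cands]
    min_of_max = min(b[1] for b in bounds)
    return [j for j, b in zip(cands, bounds) if b[0] <= min_of_max]

def assign(node, cands, centers, sums, counts):
    cands = prune(cands, centers, node)
    if len(cands) == 1:  # the whole subtree belongs to one prototype
        j = cands[0]
        counts[j] += node["n"]
        for d, s in enumerate(node["sum"]):
            sums[j][d] += s
    elif node["kids"] is None:  # leaf: direct distances to surviving candidates
        for p in node["points"]:
            j = min(cands, key=lambda c: sq_dist(p, centers[c]))
            counts[j] += 1
            for d, x in enumerate(p):
                sums[j][d] += x
    else:
        for kid in node["kids"]:
            assign(kid, cands, centers, sums, counts)

def kmeans_step(root, centers):
    """One k-means iteration; returns the new prototypes and cluster sizes."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    assign(root, list(range(k)), centers, sums, counts)
    new = [[s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
           for j in range(k)]
    return new, counts

# Usage on synthetic data: three Gaussian blobs in the plane.
random.seed(0)
data = [(random.gauss(cx, 0.5), random.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(200)]
root = build(data)
centers = random.sample(data, 3)
for _ in range(10):
    centers, counts = kmeans_step(root, centers)
```

The tree is built once and reused in every iteration. When a node is left with a single candidate, its stored count and coordinate sum update that prototype without visiting individual patterns, which is where the saving in distance calculations would come from under these assumptions.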
References:
- K. Alsabti, S. Ranka, and V. Singh. An Efficient K-Means Clustering Algorithm. http://www.cise.ufl.edu/ranka/, 1997.
- T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. McGraw-Hill Book Company, 1990.
- R. C. Dubes and A. K. Jain. Algorithms for Clustering Data. Prentice Hall, 1988.
- M. Ester, H. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proc. of the 2nd Int’l Conf. on Knowledge Discovery and Data Mining, August 1996.
- M. Ester, H. Kriegel, and X. Xu. Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification. Proc. of the Fourth Int’l. Symposium on Large Spatial Databases, 1995.
- J. Garcia, J. Fdez-Valdivia, F. Cortijo, and R. Molina. Dynamic Approach for Clustering Data. Signal Processing, 44:(2), 1994.
- D. Judd, P. McKinley, and A. Jain. Large-Scale Parallel Data Clustering. Proc. Int’l Conference on Pattern Recognition, August 1996.
- L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 1990.
- K. Mehrotra, C. Mohan, and S. Ranka. Elements of Artificial Neural Networks. MIT Press, 1996.
- R. T. Ng and J. Han. Efficient and Effective Clustering Methods for Spatial Data Mining. Proc. of the 20th Int’l Conf. on Very Large Databases, Santiago, Chile, pages 144–155, 1994.
- V. Ramasubramanian and K. Paliwal. Fast K-Dimensional Tree Algorithms for Nearest Neighbor Search with Application to Vector Quantization Encoding. IEEE Transactions on Signal Processing, 40:(3), March 1992.
- E. Schikuta. Grid Clustering: An Efficient Hierarchical Clustering Method for Very Large Data Sets. Proc. 13th Int’l. Conference on Pattern Recognition, 2, 1996.
- J. White, V. Faber, and J. Saltzman. United States Patent No. 5,467,110. Nov. 1995.
- T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: An Efficient Data Clustering Method for Very Large Databases. Proc. of the 1996 ACM SIGMOD Int’l Conf. on Management of Data, Montreal, Canada, pages 103–114, June 1996.