Author: Ms. S. M. Durge
Date of Publication: 20th April 2017
Abstract: Clustering is an unsupervised method that divides data into disjoint subsets with high intra-cluster similarity and low inter-cluster similarity. Most approaches to web document clustering assign each object to exactly one of a set of clusters. Objects in the same cluster are similar to each other, where similarity is based on a measure of the distance between them. This works well for compact, well-separated groups of data, but in many situations the resulting clusters differ from one run to the next. The proposed method uses the k-means++ algorithm, which addresses this problem by spreading the initial centers evenly, and improves performance.
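As an illustration of how k-means++ spreads the initial centers, the following is a minimal sketch of the standard seeding step, not the author's implementation. It assumes points are numeric tuples compared by squared Euclidean distance; the names `squared_distance` and `kmeans_pp_init` are hypothetical and chosen only for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=None):
    """Pick k initial centers from points using k-means++ style seeding.

    The first center is drawn uniformly at random. Each later center is
    drawn with probability proportional to its squared distance from the
    nearest center chosen so far, so far-away points are favored.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
            continue
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Toy data (assumed for illustration): three small, well-separated groups.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (10.0, 0.0), (10.2, 0.1), (9.9, 0.3),
]
centers = kmeans_pp_init(points, k=3, seed=0)
```

Because an already chosen point has weight zero, it is not drawn again while any other point has non-zero weight. The seeding is still random, so different seeds can give different centers; it only makes widely spread starting centers more likely than uniform sampling would.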